Monorepo or Polyrepo? 3 Critical Consequences of Your CI/CD Choice

#career #architecture #cicd #monorepo

Over the years, I've used various repository structures across many software projects, from small side products to large enterprise ERP systems. Each time I faced the "monorepo or polyrepo" question at the start of a project, I realized that the decision was not just about where the code lived. It had broad implications, from CI/CD processes to team motivation and even the company's long-term technology strategy. In this post, I want to share my experience with 3 critical consequences of this choice, specifically for CI/CD processes.

When making this decision, I always asked myself "what would be more suitable for the project and team dynamics" rather than "what would be better for me," because what was ideal for one project could be a complete nightmare for another. Making the right choice meant preventing many operational pains before they even began.

My Perspective on Monorepo and Polyrepo: An Attempt at Definition

I define a monorepo as a structure where all my projects, services, and libraries live in a single Git repository. This was an approach I used in the early days when I kept both the backend and frontend of my Android spam blocker application in the same repo. Everything was in one place, changes were easy to track, and managing dependencies was less complex. This provided great convenience, especially in small teams or projects where I worked alone.

On the other hand, polyrepo means using a separate Git repository for each service, library, or independent application. When developing the ERP for a manufacturing company, we had hundreds of different services and modules; we chose to keep each of them in a separate repo. This allowed teams to work more autonomously, manage their own release cycles, and progress independently. However, this independence brought its own set of challenges, especially when it came to CI/CD processes.

ℹ️ The Monorepo Difference

A monorepo is not just about using a single repository. The important thing is that multiple, independently deployable projects live within this repository and are managed with a common versioning and build system. Otherwise, using a single repo for a single large project is not considered a monorepo; it's just a standard project.

For one of my side products, I initially started with a monorepo-like structure. The frontend (Vue) and backend (FastAPI) code were in the same repo. A simple git push triggering the CI/CD for both sides and deploying everything together was sufficient for me. However, as the project grew and I started adding different modules, rebuilding and retesting the entire system with every commit began to waste time. It was at that point that I started to see some of the challenges alongside the advantages of a monorepo.

Impact on CI/CD Processes: Build and Test Optimization

The biggest CI/CD challenge in a monorepo was rebuilding and retesting the entire system from scratch with every commit. Especially in large projects, this meant pipelines that could take hours. In the ERP of a manufacturing company, a full build and test cycle could exceed 3 hours. This delayed feedback to developers and reduced iteration speed. At first, I thought "it's fine," but after a while, I realized this situation was unsustainable.

As a solution, I tried to apply the change-detection logic of monorepo tools (such as Nx's "affected" commands or Bazel's dependency-graph queries) to my own CI/CD pipelines. Using git diff, I aimed to detect only the changed code and the modules that depend on it, and then build and test only those. For example, I would list the files changed in the last commit with git diff --name-only HEAD~1 HEAD and use a script to work out which services were affected.

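# Example CI/CD step that detects only changed services (bash).
# Note: HEAD~1..HEAD covers only the most recent commit; a push that
# contains several commits needs a wider range.
CHANGED_FILES=$(git diff --name-only HEAD~1 HEAD)
AFFECTED_SERVICES=""

# Patterns are anchored with ^ so that only paths starting at the
# repository root match (not e.g. docs/services/order/).
if echo "$CHANGED_FILES" | grep -q "^services/order/"; then
    AFFECTED_SERVICES+="order "
fi
if echo "$CHANGED_FILES" | grep -q "^services/inventory/"; then
    AFFECTED_SERVICES+="inventory "
fi
# ... continues for other services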

if [ -n "$AFFECTED_SERVICES" ]; then
    echo "Changed services: $AFFECTED_SERVICES"
    # Build and test only the affected services
    for SERVICE in $AFFECTED_SERVICES; do
        echo "Building and testing $SERVICE..."
        # build_service.sh and test_service.sh scripts
        # were written to process only the relevant service.
        ./scripts/build_service.sh "$SERVICE"
        ./scripts/test_service.sh "$SERVICE"
    done
else
    echo "No services affected, CI/CD step skipped."
fi

This approach cut monorepo build times by roughly 75%, bringing a 3-hour process down to about 45 minutes. In a polyrepo, the situation was different. Since each repo had its own CI/CD pipeline, the pipeline was triggered only when the code in that specific repo changed. This provided fast feedback for each service. However, if a shared dependency changed (e.g., a common library), every repo using that library had to be rebuilt, either manually or through a complex triggering mechanism. This added a separate management overhead. I did a more detailed analysis in [related: CI/CD pipeline optimization].
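The script above only matches file paths; the "dependent modules" half of the idea can be sketched separately. What follows is a minimal sketch under stated assumptions, not the pipeline from the project: the DEPS_MAP contents, the common-lib and reporting names, and both function names are hypothetical. It expands a list of changed services to everything that depends on them, directly or transitively. The same lookup could also decide which downstream repos to trigger in a polyrepo when a common library changes.

```shell
#!/bin/sh
# Hypothetical dependency map, one line per service:
#   <service>:<space-separated direct dependencies>
DEPS_MAP='order:common-lib inventory
inventory:common-lib
reporting:order
common-lib:'

# Print every service that lists $1 as a direct dependency.
direct_dependents() {
    printf '%s\n' "$DEPS_MAP" | while IFS=: read -r owner deps; do
        for dep in $deps; do
            if [ "$dep" = "$1" ]; then
                echo "$owner"
            fi
        done
    done
}

# Expand the changed services ($@) to the full affected set:
# the services themselves plus all direct and transitive dependents.
affected_closure() {
    seen=""
    queue="$*"
    while [ -n "$queue" ]; do
        next=""
        for item in $queue; do
            case " $seen " in
                *" $item "*) continue ;;   # already visited
            esac
            seen="$seen $item"
            next="$next $(direct_dependents "$item")"
        done
        queue=$(echo $next)   # unquoted on purpose: collapses spaces and newlines
    done
    # Sort for stable output and join the names with single spaces.
    printf '%s\n' $seen | sort | xargs
}

affected_closure common-lib
```

With this map, a change to common-lib marks every service as affected, while a change to reporting affects only reporting. In a real pipeline the map would have to come from the build system or package manifests rather than a hand-written string.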

Deployment and Rollback Chaos: Versioning and Dependencies

In a monorepo, deployment is typically managed with "atomic deploy" or "independent deploy" strategies. Atomic deploy means deploying the entire system as a single unit. This simplifies things, especially in small to medium-sized systems or when microservices are highly interdependent. All services switch to the new version simultaneously, which reduces compatibility issues. However, redeploying the entire system for a minor change in one service can be risky and slow. In March 2024, in my ERP project, we had to redeploy the entire system to fix a small reporting error, which caused a 15-minute outage.

Independent deploy, even within a monorepo, allows each service to be deployed at its own pace. This means deploying only the changed services, as in the git diff example above. This approach increases deployment speed and reduces risk. However, managing version compatibility between services becomes more difficult. For example, a new version of service A might require a newer version of service B, while service C still depends on the older B. This reminds me of the difficulties I experienced with [related: API versioning strategies].

In a polyrepo, since each service is deployed independently, versioning and dependency management could turn into a complete nightmare. A bank's internal platform had over 100 microservices. Each of these services had its own version number and dependencies. While version 1.2.0 of Service A depended on version 3.5.1 of Service B and version 2.0.0 of Service C, version 4.0.0 of Service D only worked with version 3.5.0 of Service B. Tracking this dependency matrix, knowing which service was compatible with which version, created huge chaos in deployment and rollback scenarios.
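One way to keep such a matrix from surprising you at deploy time is a pre-deploy check. The sketch below is only an illustration: the REQUIREMENTS format and the find_conflicts and check_release names are assumptions, and the versions are the ones from the example above. It reports every pinned requirement that a planned set of service versions would violate.

```shell
#!/bin/sh
# Pinned requirements from the example above, one per line:
#   <service>@<version> <dependency>=<required version>
REQUIREMENTS='A@1.2.0 B=3.5.1
A@1.2.0 C=2.0.0
D@4.0.0 B=3.5.0'

# find_conflicts "<svc>=<version> ..." prints each violated requirement.
find_conflicts() {
    printf '%s\n' "$REQUIREMENTS" | while read -r owner requirement; do
        case " $1 " in
            *" ${owner%@*}=${owner#*@} "*) ;;   # this service version is in the release
            *) continue ;;                       # not being deployed, skip its pins
        esac
        case " $1 " in
            *" $requirement "*) ;;               # requirement satisfied
            *) echo "$owner needs $requirement" ;;
        esac
    done
}

# check_release prints the conflicts and returns 1 if there are any.
check_release() {
    conflicts=$(find_conflicts "$1")
    if [ -n "$conflicts" ]; then
        printf '%s\n' "$conflicts"
        return 1
    fi
    return 0
}

check_release "A=1.2.0 B=3.5.1 C=2.0.0 D=4.0.0" || echo "release blocked"
```

For the versions in the example, the check flags that D 4.0.0 cannot be deployed alongside B 3.5.1. Exact-match pins keep the sketch short; real requirements are usually version ranges.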

⚠️ Versioning Nightmare

In a polyrepo, Semantic Versioning (SemVer) and good documentation are essential for managing the dependency matrix. Otherwise, finding the answer to "why isn't this service working?" could take days. Especially with updates containing breaking changes, all dependent services may need to be updated simultaneously or compatibility layers may need to be created.
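As a rough illustration of how SemVer can be used in a pipeline, the sketch below classifies a version change so that a CI step could demand a coordinated update only for major bumps. The bump_type name is hypothetical, and the function assumes plain MAJOR.MINOR.PATCH strings without pre-release or build metadata.

```shell
#!/bin/sh
# bump_type <old version> <new version>
# Prints "major", "minor" or "patch" depending on the first component that differs.
# Assumes plain MAJOR.MINOR.PATCH versions (no pre-release or build metadata).
bump_type() {
    old_major=${1%%.*}
    new_major=${2%%.*}
    old_rest=${1#*.}
    new_rest=${2#*.}
    old_minor=${old_rest%%.*}
    new_minor=${new_rest%%.*}
    if [ "$new_major" -ne "$old_major" ]; then
        echo major    # breaking change: dependents may need a coordinated update
    elif [ "$new_minor" -ne "$old_minor" ]; then
        echo minor    # new backwards-compatible functionality
    else
        echo patch    # backwards-compatible fix
    fi
}

bump_type 3.5.0 3.5.1
```

Under SemVer rules, a patch-level change such as 3.5.0 to 3.5.1 is supposed to be backwards compatible, so a service that only works with one exact patch version is usually a sign of an overly strict pin or of a release that did not follow the scheme.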

The situation was similar for rollbacks. If you're doing atomic deploys in a monorepo, reverting to a single commit is relatively easy. However, if you're doing independent deploys, you need to correctly identify which version of which service to roll back. In a polyrepo, when rolling back one service, you might have to check if other dependent services are compatible and, if necessary, roll them back as well. This was the last thing you wanted when dealing with a production error at 3:00 AM.
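With independent deploys, the question of "which version of which service" becomes easier to answer if every deploy records a snapshot of what was running. The following is a minimal sketch assuming such snapshots exist as "service=version" lists; the snapshot format, the service versions, and the version_of and rollback_plan names are all hypothetical.

```shell
#!/bin/sh
# version_of <service> "<svc>=<version> ..."
# Prints the version recorded for the service, or nothing if it is absent.
version_of() {
    for entry in $2; do
        if [ "${entry%%=*}" = "$1" ]; then
            echo "${entry#*=}"
        fi
    done
}

# rollback_plan "<last known good snapshot>" "<current snapshot>"
# Prints one line per service whose version differs from the last good snapshot.
rollback_plan() {
    for current in $2; do
        name=${current%%=*}
        good=$(version_of "$name" "$1")
        if [ "$good" != "${current#*=}" ]; then
            # A service missing from the last good snapshot is marked for removal.
            echo "$name: ${current#*=} -> ${good:-remove}"
        fi
    done
}

rollback_plan "order=1.4.0 inventory=2.1.0 reporting=0.9.2" \
              "order=1.5.0 inventory=2.1.0 reporting=0.9.3"
```

The output lists only order and reporting, since inventory did not change between the two snapshots. The plan says what to roll back, not whether the older versions are still compatible with each other; that check remains a separate step.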

Team Structure and Developer Experience: Fast Iteration or Independence?

The most obvious impact of a monorepo on developer experience was code sharing and the power of "find-and-replace." Since common libraries, UI components, or configuration files were in one place, it was possible to see how a change would affect the entire system and easily make the necessary updates. When I changed an API contract in the backend of my mobile application, I could update the frontend code in the same commit. This provided fast iteration and consistency.

However, a monorepo also had some disadvantages. With everyone working on the same codebase, merge conflicts could increase. It was also difficult to master the entire codebase in a large monorepo, and sometimes a change in one module caused a problem in an unexpected place. In a client project, in February 2025, it took us 2 days to figure out how a database schema change affected a seemingly unrelated reporting service.

Polyrepo provided teams with more autonomy and independence. Each team owned the repo for their services and could progress at their own pace. This was invaluable, especially in environments with hundreds of developers and different product teams, as I saw on a large e-commerce site. Each team could choose its own technology, CI/CD pipeline, and release cycle (within certain limits, of course). This was perfect for teams who said, "don't interfere with our work, we'll go our own way."

💡 Team Culture and Repo Choice

The choice of repo structure directly influences team culture and communication style. Monorepo encourages more collaboration and a centralized approach, while polyrepo supports more independent and autonomous teams. When making this decision, you should consider your current team structure and the culture you desire.

But this independence came at a cost. Sharing common code became difficult. When a library needed an update, it had to be updated separately in all repos that used it. This created a lot of manual workload and potential for inconsistency. When a common security patch (e.g., a vulnerability in a JWT library) came out, we had to scan and update hundreds of repos. This brought back the panic we experienced in December 2023 for a CVE in an oauth2 library.

Security and Compliance: Common Vulnerabilities and Isolation

From a security perspective, I see that both monorepo and polyrepo have their own advantages and disadvantages. In a monorepo, since the management of common dependencies is centralized, when a security vulnerability (like the famous Log4j vulnerability) was detected, it was relatively easy to see all affected modules in one place and apply the patch. Being able to update the dependency version across the entire system with a single PR was a huge advantage for quick action. When the Log4j crisis erupted at the end of 2021, applying and deploying the fix in a monorepo project took us only a few hours.

However, in a monorepo, the potential for a security vulnerability to affect the entire system is also higher. A vulnerability in a commonly used library can spread to all projects. Furthermore, having modules with different security levels in the same repo can create complexity in terms of authorization and access control. For example, having code for a module dealing with sensitive financial data and a public marketing page in the same repo requires stricter access policies for the repo. As I previously mentioned in [related: Software security best practices], minimizing such risks is essential.

Polyrepo, on the other hand, provided better isolation. Since each service had its own repo, a security vulnerability in one service generally did not directly affect other services. This helped limit the "blast radius" (the potential impact of an error or attack). In a bank's internal platform, code related to critical financial services was kept in completely separate repos from the code for less sensitive internal tools. This allowed us to apply different security policies and access controls for each repo.

🔥 Security Tracking in Polyrepo

In a polyrepo structure, separately scanning each repo for security and tracking its dependencies requires significant effort. Using automated security scanning tools (dependency scanning, SAST/DAST) and integrating them into the CI/CD pipeline is vital. Otherwise, it's very easy to overlook outdated, vulnerable dependencies among hundreds of repos.

However, in a polyrepo, security tracking and patch application created a significant operational burden. I had to set up a centralized system to scan hundreds of repos individually, identify outdated dependencies, and apply patches. In April 2024, I had to manually check over 50 repos for a CVE in a protobuf library. This situation once again demonstrated how critical an automated dependency management and security scanning system is.
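As a rough idea of what the first step of such automation can look like, the sketch below scans a directory of checked-out repos for an outdated pin of one package. Everything specific in it is an assumption: the one-directory-per-repo layout, the use of requirements.txt with exact "name==version" pins, the examplelib package name, and the find_vulnerable_repos function name. It is not a replacement for a real dependency scanner.

```shell
#!/bin/sh
# find_vulnerable_repos <root dir> <package> <first fixed version>
# Assumes <root dir>/<repo>/requirements.txt with exact pins such as "examplelib==1.2.3".
# Prints "<repo> <pinned version>" for every repo pinned below the fixed version.
find_vulnerable_repos() {
    for manifest in "$1"/*/requirements.txt; do
        [ -f "$manifest" ] || continue
        pinned=$(sed -n "s/^$2==\(.*\)/\1/p" "$manifest" | head -n 1)
        [ -n "$pinned" ] || continue          # package not used in this repo
        # Numeric sort on major, minor and patch; the first line is the lower version.
        lowest=$(printf '%s\n%s\n' "$pinned" "$3" |
            sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)
        if [ "$pinned" != "$3" ] && [ "$lowest" = "$pinned" ]; then
            echo "$(basename "$(dirname "$manifest")") $pinned"
        fi
    done
}

# Hypothetical usage: find_vulnerable_repos ./checkouts examplelib 1.4.0
```

A scan like this only sees direct, exactly pinned dependencies. Transitive dependencies and version ranges are where dedicated dependency-scanning tools earn their place in the pipeline.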

My Preference and Lessons Learned: Flexibility Based on the Situation

For me, there is no clear "always this" answer to the question of monorepo versus polyrepo. The dynamics of each project, team size, project complexity, and even company culture influence this decision. However, through years of experience, I have developed some general principles.

For side products I started with small teams or developed alone, I always preferred a monorepo initially. Fast iteration, easy code sharing, and the simplicity of managing a single CI/CD pipeline were invaluable to me. Keeping the backend and frontend of my financial calculators in the same repo allowed me to develop and deploy both sides simultaneously when adding a new feature. This can be a logical starting point, especially for those who want to quickly launch products with a startup mindset.

However, as the project grew, the number of team members increased, and we started to divide into different business domains, I began to see the benefits of transitioning to a polyrepo. Especially in the ERP of a manufacturing company, when modules belonging to completely different domains like production planning, supply chain, and finance needed their own independent lifecycles, a polyrepo became more suitable. Each team independently developing and deploying their own services increased overall efficiency. Of course, this transition also had its costs and difficulties: separating all dependencies, setting up new CI/CD pipelines, and defining a versioning strategy took months of effort.

ℹ️ Hybrid Approaches

In some cases, hybrid approaches like "polyrepo within a monorepo" can also be used. For example, common libraries and some critical services might reside in the main repo, while more independent services are kept in separate repos. Or, structures can be built within a monorepo where different services trigger their own CI/CD pipelines but still benefit from centralized management. This reminds me of my work on [related: integration patterns in microservice architecture].

One of the most important lessons learned was that this decision is not a one-time, irreversible choice. The repo structure may need to evolve with the project's evolution. The key is to understand the advantages and disadvantages of both structures and make the most appropriate decision considering the project's current state and future goals. In my experience, starting small and breaking things apart as they grow has generally been less painful than diving into a large polyrepo mess from the start.

Conclusion

The choice between monorepo and polyrepo is more than just a technical preference; it's a strategic decision that deeply affects the efficiency of CI/CD processes, the complexity of deployment strategies, and how teams work. A monorepo offers fast iteration and easy centralized management initially, but as it scales, it can present challenges with build times and dependency management. A polyrepo provides team autonomy and isolation but can complicate the dependency matrix and security tracking.

When making this decision, I always considered the project's current size, team structure, future scaling needs, and security requirements. Both approaches have their own strengths and difficulties. The important thing is to understand these trade-offs well and find the most pragmatic solution for your project. Remember, this is not an "all or nothing" decision, but an evolving architectural choice.