Discussion on: Why might a project/company use a monorepo?

Fred Ross

I see three pros to the giant monorepo:

  1. Everything advances in lockstep. You use the latest version of every library when you build. There are no questions about which versions of which libraries went into a build. If you have a commit hash, you have everything.
  2. Developers can move around the codebase easily. The tooling that worked in directory X should also work in directory Y. An improvement in tooling in one place propagates everywhere.
  3. It removes much of the sense of ownership, since your code is just part of a giant pile of code, which can stem or at least limit certain organizational pathologies.

But this discussion only really applies to a large repository. Here "large" means at least 500MB of source code and tens of thousands of files. Below that scale there is simply no reason to mess with submodules or anything but a single repository. The pain points you might have with one repository at that size come from the tooling around the repository, such as continuous build or testing. The source control system itself isn't a bottleneck there. Your effort is better spent fixing the tooling, because if it's painful now, it's only going to get worse later. At these scales the only reason to have multiple repositories is to control access to parts of the source code.
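If you want to know which side of that line your own repository falls on, a rough measurement is enough. Here is a minimal sketch in Python, assuming a checked-out working tree on disk. The names `measure_tree` and `is_large` are made up for illustration, 10,000 files is my stand-in for the low end of "tens of thousands," and requiring both thresholds at once is my reading of the rule of thumb, not a hard definition.

```python
import os

# Rule-of-thumb thresholds from the comment above (the exact cutoffs are assumptions).
LARGE_BYTES = 500 * 1024 * 1024  # "at least 500MB of source code"
LARGE_FILES = 10_000             # low end of "tens of thousands of files"


def measure_tree(root):
    """Return (total_bytes, file_count) for regular files under root, skipping .git."""
    total_bytes = 0
    file_count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune version control metadata in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not count symlinks, which may point outside the tree
            total_bytes += os.path.getsize(path)
            file_count += 1
    return total_bytes, file_count


def is_large(total_bytes, file_count):
    """True only when both the size and the file-count thresholds are met."""
    return total_bytes >= LARGE_BYTES and file_count >= LARGE_FILES
```

Calling `is_large(*measure_tree("."))` from the root of a checkout gives a quick answer. This counts every file in the working tree, including build output and vendored binaries, so treat the result as an upper bound on "source code."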

Once you get past that size, you pick a direction. If you choose many repositories, you have to build an orchestration system to handle all the interacting versions, and getting everyone onto shared tooling can be a lot harder. If you choose a monorepo, you deal with slower and slower operations and lots of hard work on the version control system itself. Both paths work if you put in the engineering effort. Google and Facebook went monorepo. Amazon does multiple repos.

There's an underlying worldview that I need to challenge though. When people start talking about these things, the implicit idea is, "Google/Facebook/Amazon/big company do this, so it must be a good idea." These companies are outliers. Their scale has made working with their systems frankly annoying. What they do is not best practice for anyone without their problems. Giant mining machines have wheels so they can move (at speeds measured in meters per minute), but no one would think that the procedures for operating one should be emulated for an articulated bus or tractor trailer.

When someone says, "Yeah, but that's how Google does it," your first response should be, "Why do they do it that way?" In many cases the answer is, "The system got out of control in this particular dimension and it was the only way we could think of to keep it operational."