Discussion on: Why might a project/company use a monorepo?

Boris Jamot ✊

Very interesting subject.

Everyone here seems to see only pros in using a monorepo.
For my part, though I can understand the pros for a single project, a startup, or a small company, I see a lot of cons in other cases.

I work in a huge company with thousands of developers working on a wide variety of projects. Some of them are just for research, while others are for production.
I can't imagine a monorepo in a company where many languages are used.
How do you deal with your git hooks?
How do you deal with your pipeline?
How do you deal with git tags?

There are a lot of situations where a monorepo makes life more complex, or even impossible.

Imagine the git tree in a company with hundreds of projects!

I would love to get feedback from people who have used a monorepo "in real life".

David Ojeda

I know Google uses a monorepo, but how they manage all that complexity is something I can't even imagine...

Yves Junqueira

Google doesn't use git, so a lot of the tools and features you might be used to don't work there. That's good, though. Developers and bots make changes to Google's code base at a very fast rate (many changes per second!), so there's no way that a central hook registry would be able to run everything necessary for every change, nor could a central pipeline take care of everything in the repository 😱.

Each directory has the equivalent of hooks (presubmit checks). These are run by independent microservices (often maintained by independent teams) and must pass before the code is merged into head. There is also infrastructure for watching sub-trees, so tools and pipelines can be triggered when code is updated - often creating a cascade of even more changes!
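To make the per-directory idea concrete, here is a minimal sketch of how checks might be resolved for a change. It is an illustration under stated assumptions, not how Google's tooling is actually implemented: the `PRESUBMIT_CHECKS` registry, the directory names, and the check names are all hypothetical, and a real system would dispatch each check to a separate service rather than return a list of names.

```python
from pathlib import PurePosixPath

# Hypothetical registry: directory -> presubmit checks owned by that subtree.
# The empty string stands for the repository root.
PRESUBMIT_CHECKS = {
    "": ["license_header"],
    "search": ["search_lint"],
    "search/indexer": ["indexer_unit_tests"],
    "ads": ["ads_policy_check"],
}


def checks_for_file(path):
    """Collect checks from the file's directory and every ancestor directory."""
    checks = []
    directory = PurePosixPath(path).parent
    while True:
        # PurePosixPath renders the top-level directory as "."
        key = "" if str(directory) == "." else str(directory)
        checks.extend(PRESUBMIT_CHECKS.get(key, []))
        if key == "":
            break
        directory = directory.parent
    return checks


def checks_for_change(paths):
    """Union of the checks for all touched files, in first-seen order."""
    seen = []
    for path in paths:
        for check in checks_for_file(path):
            if check not in seen:
                seen.append(check)
    return seen


if __name__ == "__main__":
    # A change touching two unrelated subtrees only triggers their own checks
    # plus whatever is registered at the root.
    print(checks_for_change(["search/indexer/main.cc", "ads/bid.py"]))
```

The point of the sketch is that nothing is global except the root entry: a change under `ads/` never runs the `search` checks, which is what keeps a repository-wide hook registry unnecessary.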

You can compare such large-scale monorepos to the Internet. The Internet gives us the illusion of a single domain namespace, but DNS TLDs are each responsible for a sub-part of it. Same for IP addresses. In Google's Piper, directory trees have their own hooks, ownership and tools. It works beautifully, actually, but there's also a lot of complexity and technical debt to manage since Google isn't that new.

I worked at Google for a number of years and I'm happy to answer questions if you have any. Their source code management practices have been published, so I wouldn't be disclosing anything new. Happy coding!

Russell Parks

The point here is having good tooling for a use case. Git isn't the right tool for a monorepo if you want those other features (and most non-small development orgs will probably need them). Sorry for all the hedging (most, probably). Every use case is different.
This also leads into writing good tools and defining good requirements/use cases, but now I'm getting ahead of myself.

Yves Junqueira

Git in this case acts just like smart file storage. Microsoft and Facebook use git for very large repos and it works for them, for example.

Fred Ross

Facebook uses Mercurial. Microsoft did a lot of work in the last year or two to make git work well with big repos.

Fred Ross

You don't use them the same way. There is a layer of tooling between your local system and the main line. All the hooks, tags, and the like are local to you. Once your commit has passed code review and automated testing, it gets integrated into the main line by the tooling. You never, ever run a merge to master by hand, nor do you have permission to.
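As a rough illustration of that tooling layer, the sketch below models a submit gate that is the only path onto the main line. Everything in it is an assumption made for the example: the `Change` and `Mainline` types, their field names, and the `submit` function are hypothetical and do not correspond to any particular company's system.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    author: str
    description: str
    approved_by_reviewer: bool = False
    tests_passed: bool = False


@dataclass
class Mainline:
    # Developers never append here directly; only submit() does.
    history: list = field(default_factory=list)


def submit(mainline, change):
    """Integrate a change only if review and automated tests both passed.

    Returns (landed, reason) so the caller can see why a change was rejected.
    """
    if not change.approved_by_reviewer:
        return False, "needs code review approval"
    if not change.tests_passed:
        return False, "automated tests have not passed"
    mainline.history.append(change)
    return True, "integrated"


if __name__ == "__main__":
    main = Mainline()
    draft = Change("dev", "tweak indexer")
    print(submit(main, draft))  # rejected: not reviewed yet
    draft.approved_by_reviewer = True
    draft.tests_passed = True
    print(submit(main, draft))  # lands on the main line
```

Local hooks, tags, and branches stay on the developer's side of this gate, so they can differ per person without affecting anyone else.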