I think branches that take time and diverge from the main branch are always problematic. There are best practices and software to resolve conflicts, and regular merges from master can ease the pain, but, in my experience, it's often time-consuming, especially when there are lots of contributors and workflows on the same app.
All the points you mentioned, like having a good naming convention and sticking with it, or deleting merged branches, are critical.
For old branches, I've noticed that some organizations like to keep them for various reasons I won't go into here, but if you have lots of branches, it will impact the size of the repo for sure, which may impact CI/CD processes as well.
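As a rough sketch of the "delete merged branches" point, the Python below lists local branches that are already merged into a base branch, so they can be reviewed before deletion. It assumes the base branch is called master (as in this thread) and that git is on the PATH. The function names and the set of protected branches are illustrative, not something from the discussion.

```python
import subprocess

# Assumption: branches that should never be proposed for deletion.
PROTECTED = {"master", "main", "develop"}


def parse_merged_branches(output, protected=PROTECTED):
    """Turn `git branch --merged` output into a list of deletable branch names."""
    branches = []
    for line in output.splitlines():
        name = line.strip()
        if not name:
            continue
        # "* " marks the current branch (or a detached HEAD) and "+ " marks a
        # branch checked out in another worktree. Skip both.
        if name.startswith(("* ", "+ ")):
            continue
        if name in protected:
            continue
        branches.append(name)
    return branches


def merged_branches(base="master"):
    """Ask git which local branches are fully merged into `base`."""
    result = subprocess.run(
        ["git", "branch", "--merged", base],
        capture_output=True, text=True, check=True,
    )
    return parse_merged_branches(result.stdout)
```

Calling merged_branches() inside a repository returns candidate names only. Deleting them (for example with git branch -d) is deliberately left as a manual step, since some organizations want to keep old branches.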
Tags are really handy to contextualize.
What do you think takes time in branching?
Long-lasting branches take time. You sometimes have to handle lots of conflicts, especially with dependencies or recipes, possibly migrations, etc.
Naming, lol. Sometimes my developers take a lot of time to think about naming.
Pretty good sign. Naming is a critical step!
Also, at what repository sizes did CI/CD start to be impacted? And in that case, what checkout strategy did you use on the CI/CD builders?
Hundreds of useless MB ^^. I did not use any checkout strategy; I cleaned the repo.
You may run into this kind of situation when cloning famous open-source projects without a high-speed connection.
Was this because of larger files in the repository? Delta-compressed, gzipped text-source blobs are usually ridiculously small. But I get your point. At my last gig the monorepo was around 1.5 GB, which was still manageable. For another big repo (which had a fair amount of binaries because of the app itself, and where we had a bad experience with git-lfs), we used a separate "single clean history" for CI/CD purposes.
Various causes, but yes: PSDs, etc., without LFS, plus years and years of commits and some libraries added "as is" without a package manager.
How? Another repository with the builds?
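To find out which files are responsible in a case like this, one option is to rank every blob in the history by size. The sketch below pipes the output of git rev-list --objects --all into git cat-file --batch-check and keeps the largest blobs. The function names are illustrative, and the sizes reported are uncompressed object sizes, not the space actually used on disk.

```python
import subprocess

# Format understood by `git cat-file --batch-check`; %(rest) carries the path
# that `git rev-list --objects` prints after each object id.
BATCH_FORMAT = "%(objecttype) %(objectname) %(objectsize) %(rest)"


def largest_blobs(batch_output, top=10):
    """Return the `top` biggest blobs as (size_in_bytes, path) tuples."""
    blobs = []
    for line in batch_output.splitlines():
        parts = line.split(" ", 3)  # the path may itself contain spaces
        # Skip commits, trees, tags, and malformed lines.
        if len(parts) < 4 or parts[0] != "blob":
            continue
        blobs.append((int(parts[2]), parts[3]))
    return sorted(blobs, reverse=True)[:top]


def scan_repository(top=10):
    """Run both git commands in the current repository and rank the blobs."""
    objects = subprocess.run(
        ["git", "rev-list", "--objects", "--all"],
        capture_output=True, text=True, check=True,
    ).stdout
    sizes = subprocess.run(
        ["git", "cat-file", f"--batch-check={BATCH_FORMAT}"],
        input=objects, capture_output=True, text=True, check=True,
    ).stdout
    return largest_blobs(sizes, top)
```

Running scan_repository() in a repository with old PSDs or vendored libraries should put them at the top of the list, which helps decide what to move out before any history rewrite.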
Exactly. Do normal development on the main fork. Then when a build needs to be made, squash push to the other fork. I'm not sure exactly how we scripted cutting off the history to avoid "blowing" up the other repo, but there was a check in place.
It's a valid approach, but the only inconvenience is indeed the scripting part, as you have to take all edge cases into account.
This happened after I was less involved with that one project inside the company. I would probably have tried to rewrite some of the history, but I'm not sure how badly it would have impacted, say, the JIRA <> GitHub referencing or existing pull requests, which is information I would have liked to keep. Maybe no one ever uses that information either, who knows.
Hmm, dunno how it could impact hosting providers. I would say you cannot erase everything unless you remove the entire repo, but I'm not sure. When in doubt, I recommend asking questions to everyone involved in the process, including "non-technical" people, to determine what should be kept.
Did you use filter-branch for cleanup? What strategies did you use?
I started by removing lots of unused old branches and tags. I extracted all heavy files that had nothing to do with the git workflow.
Yes, filter-branch, with an additional option I don't remember to ignore some commits like delete commits.
I made a full backup before all operations, as it's kinda risky even when you know what you're doing. After some time, it was possible to get rid of the backup.
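For the backup step, one possible approach (an assumption, not necessarily what was done here) is a mirror clone, which copies all branches, tags, and other refs before history is rewritten. The sketch below only builds the git clone --mirror command with a timestamped destination. The function name and directory layout are hypothetical.

```python
import subprocess
from datetime import datetime, timezone


def backup_command(source, backup_root, now=None):
    """Build a `git clone --mirror` command with a timestamped destination."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%d-%H%M%S")
    destination = f"{backup_root}/backup-{stamp}.git"
    return ["git", "clone", "--mirror", source, destination]


def make_backup(source, backup_root):
    """Create the mirror clone; raises CalledProcessError if git fails."""
    subprocess.run(backup_command(source, backup_root), check=True)
```

Calling make_backup() with the repository path or URL and a backup directory produces a bare copy that can be pushed back if the rewrite goes wrong, and deleted once everyone is confident in the cleaned repository.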
That matches things I've done in the past too. Thanks for all the answers.