There are many posts about Git branching strategies out there, but they're either light on details or heavy on complexity. My aim here is to define the simplest possible production-grade Git branching strategy for an analytics engineering team. Ideally, nothing can be removed and nothing needs to be added. If you disagree, leave a comment down below!
The simplest feature branching flow
The absolute simplest feature branching flow is described very well in this official dbt article. There is a main branch off of which you create your feature branches. The main branch corresponds to the production schema, and pull requests from feature branches ideally go to temporary schemas. Only modified tables should run with state deferral to main (aka slim CI) in these temporary schemas.
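As a rough sketch of what such a slim CI run could look like, the snippet below assembles the dbt invocation for a pull request. The schema naming scheme (dbt_pr_<number>), the prod-artifacts directory, and both helper names are hypothetical. The state:modified+ selector and the --defer and --state flags are standard dbt CLI options, but how the temporary schema reaches dbt depends on your own profiles.yml setup.

```python
def pr_schema(pr_number: int) -> str:
    """Hypothetical naming scheme for a temporary, per-pull-request schema."""
    return f"dbt_pr_{pr_number}"


def slim_ci_command(state_dir: str) -> list[str]:
    """Build a dbt command that runs only modified models and their children,
    deferring unmodified parents to the production artifacts in state_dir."""
    return [
        "dbt", "build",
        "--select", "state:modified+",  # changed models plus downstream
        "--defer",                      # resolve unchanged refs against prod
        "--state", state_dir,           # folder holding the prod manifest.json
    ]


if __name__ == "__main__":
    print(pr_schema(42))
    print(" ".join(slim_ci_command("prod-artifacts")))
```

The state directory is assumed to contain the manifest.json from the latest production run on main, which is what dbt compares against to decide what counts as modified.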
Adding more environments
Starting with one main branch for production and doing all your development and UAT in feature branches/pull requests will probably work just fine for small to medium-sized organizations. Larger organizations may need additional environments. However, that doesn't mean that a radically different Git branching flow is needed! Instead, our current flow can be extended with Git tags.
Adding one pre-production environment
If we want to have one pre-production environment, we can still utilize the main branch for both the production and the pre-production environments by tagging commits that are ready for production release.
This way, the latest commit in main is always pushed to pre-production environment #1, whatever you want to call it. When the team feels confident that the change can be pushed to production, they tag that commit with a production release version number, and a separate CI process that watches for tags then pushes the changes to the production environment.
Now you have your temporary schemas, one for each pull request, the 'bleeding edge' main that points to the pre-production environment, and the production environment that only gets updated when a new version is tagged in main.
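To make the routing concrete, here is a minimal sketch of the decision a tag-aware CI process would need to make. It assumes the CI system hands you a fully qualified Git ref (refs/heads/... or refs/tags/...) and that release tags look like v<major>.<minor>.<patch>. The environment labels and the function name are placeholders, not anything dbt Cloud defines.

```python
import re
from typing import Optional

# Full release tags such as refs/tags/v12.0.0 (no pre-release suffix).
RELEASE_TAG = re.compile(r"refs/tags/v\d+\.\d+\.\d+")


def target_environment(git_ref: str) -> Optional[str]:
    """Map a fully qualified Git ref to the environment it should deploy to."""
    if git_ref == "refs/heads/main":
        return "pre-production"  # every new commit on main lands here
    if RELEASE_TAG.fullmatch(git_ref):
        return "production"      # only tagged commits reach production
    return None                  # feature branches go through pull request CI


if __name__ == "__main__":
    for ref in ("refs/heads/main", "refs/tags/v12.0.0", "refs/heads/my-feature"):
        print(ref, "->", target_environment(ref))
```

Returning None for everything else keeps the deploy job from touching a shared environment when a feature branch is pushed.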
Adding a second pre-production environment
For some organizations, one pre-production environment is not enough, and they insist on two. This is still easy to do! We just have to utilize release candidate tags for the new pre-production environment.
Suppose our pre-pre-production environment is named STAGING, and our pre-production environment is named UAT. STAGING corresponds to the latest commit in the main branch - that's the 'bleeding edge'. UAT corresponds to the latest release candidate tag on the main branch. In semantic versioning, this would be achieved by adding the suffix -rc.N to the name of the release it's targeting. For example, if our goal is to create production release v12.0.0, our UAT environment commits would be tagged v12.0.0-rc.1, then v12.0.0-rc.2, and so on. Suppose on v12.0.0-rc.5 we finally feel confident enough to push to production. We would then add the tag v12.0.0 to the same commit, which would constitute a full release and then be automatically deployed to production.
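The tagging convention above can be sketched as a small parser. This is only an illustration: the function names are made up, and the pattern covers just the v<major>.<minor>.<patch> and -rc.N forms used in this post, not the full semantic versioning grammar.

```python
import re
from typing import Optional

# v<major>.<minor>.<patch> with an optional -rc.<n> suffix.
TAG = re.compile(r"v(\d+)\.(\d+)\.(\d+)(?:-rc\.(\d+))?")


def parse_tag(tag: str) -> Optional[tuple]:
    """Return (major, minor, patch, rc), or None if the tag does not match.
    rc is None for a full release."""
    m = TAG.fullmatch(tag)
    if m is None:
        return None
    major, minor, patch, rc = m.groups()
    return (int(major), int(minor), int(patch), None if rc is None else int(rc))


def environment_for_tag(tag: str) -> Optional[str]:
    """Release candidates go to UAT, full releases go to PRODUCTION."""
    parsed = parse_tag(tag)
    if parsed is None:
        return None
    return "UAT" if parsed[3] is not None else "PRODUCTION"


def latest_release_candidate(tags: list[str]) -> Optional[str]:
    """Pick the newest -rc.N tag, comparing numerically so rc.10 beats rc.9."""
    parsed = [(parse_tag(t), t) for t in tags]
    rcs = [(p, t) for p, t in parsed if p is not None and p[3] is not None]
    return max(rcs)[1] if rcs else None


if __name__ == "__main__":
    tags = ["v11.0.0", "v12.0.0-rc.9", "v12.0.0-rc.10"]
    print(latest_release_candidate(tags))
    print(environment_for_tag("v12.0.0"))
```

Parsing the numbers matters because a plain string sort would rank rc.9 above rc.10. STAGING needs no tag logic at all, since it simply follows the head of main.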
Adding even more pre-production environments???
If you depart from semantic versioning, you could in theory create a tagging convention that would let you create an unlimited number of environments from your tags. Do you have a use case for a pre-pre-pre-pre-production environment? Let me know in the comments! :D
The catch: You have to manually update the environment or write your own CI
The CI that's built into dbt Cloud can support the basic feature branching flow out of the box, but it doesn't support Git tag release strategies. This pushes folks unnecessarily into creating multiple branches for multiple environments in situations where simple tags would have served them just fine.
One option is to manually update the environment's "custom branch" in dbt Cloud settings every time there's a new release.
The other option is to do the same thing, but automatically via the API as soon as a commit in the main branch is tagged. There's an existing project that can be used as a reference. I'll update the post if I get around to creating an automated process myself.
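For what the automated variant might involve, here is a hedged sketch that only builds the HTTP request and does not send it. The endpoint path, the use_custom_branch and custom_branch field names, and the Token authorization scheme are my assumptions about the dbt Cloud v3 API and should be verified against the official API documentation. The real endpoint may also expect the full environment object rather than a partial payload. The function name and all arguments are hypothetical.

```python
import json
import urllib.request


def build_update_request(base_url: str, account_id: int, project_id: int,
                         environment_id: int, token: str, tag: str):
    """Build (but do not send) a request that points an environment's
    custom branch at the given tag."""
    # ASSUMPTION: endpoint path; check the dbt Cloud API docs before use.
    url = (f"{base_url}/api/v3/accounts/{account_id}"
           f"/projects/{project_id}/environments/{environment_id}/")
    # ASSUMPTION: field names, and that a partial payload is accepted.
    payload = {"use_custom_branch": True, "custom_branch": tag}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_update_request("https://cloud.example.com", 1, 2, 3,
                               "my-token", "v12.0.0")
    print(req.get_method(), req.full_url)
```

A CI job triggered by a new tag could pass the tag name into a function like this and send the result with urllib.request.urlopen, once the assumptions above have been checked.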