ℹ️ I've reworked this article in a different format with more content:
🧑‍💻 Platform Teams best practices
Yoann Moinet ・ Nov 15 '23
Bon matin 👋
I'm Yoann Moinet, a Frenchman living in Montpellier.
In 2019, I joined Datadog and bootstrapped the Frontend Platform team.
But what is the Frontend Platform, you ask?
We improve the developer experience and remove pain points in the day-to-day work of all the Frontend engineers at Datadog.
We cover a lot of ground: build, tests, deployment, code health, internal tools, and more...
Working in a large-scale environment, we had to come up with a charter so that we don't lose focus and try to fix everything at once, which would end up being a bad experience for everyone involved.
Here is our charter:
1. Workflows, not their implementation, need to be shared company-wide.
We should share similar workflows between teams and technologies, so it's easy for newcomers or someone working on an incident to quickly get up to speed even in a different repository/project.
Markers and primitives need to be identified in each workflow in order to keep them similar across implementations.
These implementations can differ as long as a team owns their support and they follow the same identified markers/primitives.
In the case where there is already an established tool foundation, we should have, at the very least, hooks/flexibility to customize the workflow in order to align with the project's tech stack and complexity.
Example:
We have a command to deploy to staging from whichever repo you work on. This command waits for the feature branch's pipeline to be 100% ✅ before triggering the deploy.
For the Frontend, it's safe enough to deploy to staging at an earlier step in the pipeline, so we updated the global tool to wait only for a specific job in the CI before deploying. This way it can trigger much sooner for our Frontend. This alone reduced the staging workflow by half for our Frontend teams without impacting the global workflow itself.
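The staging example above can be sketched as a small gating function. This is only an illustration, not Datadog's actual tool: the job names, the `status` values, and the functions `allJobsPassed` and `canDeployToStaging` are all assumptions made up for the sketch.

```javascript
// Hypothetical sketch: decide when a staging deploy may start.
// Assumed pipeline shape: [{ name: "build", status: "success" }, ...].

// Shared default: wait for every job in the pipeline to succeed.
function allJobsPassed(jobs) {
  return jobs.length > 0 && jobs.every((job) => job.status === "success");
}

// Customization hook: a project may list only the jobs it really needs.
// When no list is given, the shared default applies.
function canDeployToStaging(jobs, requiredJobs = null) {
  if (requiredJobs === null) {
    return allJobsPassed(jobs);
  }
  return requiredJobs.every((name) =>
    jobs.some((job) => job.name === name && job.status === "success")
  );
}

// Illustrative pipeline: the build is done, the tests are still running.
const pipeline = [
  { name: "build", status: "success" },
  { name: "unit-tests", status: "running" },
  { name: "e2e-tests", status: "pending" },
];

console.log(canDeployToStaging(pipeline)); // false
console.log(canDeployToStaging(pipeline, ["build"])); // true
```

The command and the overall workflow stay the same for everyone; only the condition that unlocks the deploy is configurable per project.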
2. Workflows should not be created or changed unless the change is tightly related to a known and documented problem.
We may want to test new technologies, or we may read articles about new workflows and want to try them out.
But unless we have a known issue with the related process, we should not change it.
We need to reach a consensus among impacted teams before starting any work on a new workflow or its update.
Example:
We use RFCs to have a transparent and open discussion about new technologies we want to use, or new workflows we want to implement. Having a written document helps with the global vision of the change we're about to make. It reveals misplaced or incompatible workflows and edge cases. We're able to gather feedback from everyone involved or touched by it and refine it along the way, making it even better and more personalised.
3. 💻 The technology chosen for a workflow should be known and understood by the people who use it the most.
Workflows implemented for the Frontend should use JavaScript. For the Backend, Python is used, etc.
This allows the people who are the most impacted by the workflow to fix and tweak it if needed.
Example:
We used to have a monorepo for both our Frontend and Backend. The infrastructure was orchestrated around a Rakefile (Ruby) triggering Bash and Python scripts. No one from the Frontend teams wanted to dive into that.
We've split the repository and started to port everything to JS, making it more approachable for us and the other Frontend engineers. We've created a new deployment platform for internal applications, written in JS. It is now used by more than 5 different teams, with 10 new internal applications, all of them maintained solely by Frontend engineers.
4. 🦾 A workflow should be tightly related to the infrastructure it's applied to, its needs, and its context.
We should not try to implement a workflow once and expect it to cover every problem in existence across unrelated platforms.
A workflow should be implemented in the context of the infrastructure it's running on/over/in…
If we aim too broad, we may end up with a cluttered workflow that's loaded with unwanted overhead, slower and more complicated than needed. This can impact multiple engineers, every time they use it.
We implement and test it once, they use it a thousand times.
Example:
Deployments used to be handled by a Bash script, written outside of our repository, which triggered a Go script, also in another repository. This workflow and tooling were written to cover every need, for everyone, with many conditions and edge cases. Like the Rakefile-based infrastructure from the previous example, nobody wanted to touch these Bash or Go scripts. It was slow, but too difficult to really update without a risk of breaking the deployment of other projects.
We wrote our own deployment script with only what was needed for deploying our Frontend. It's in JS, hosted in our main repository, so everyone can tweak it as needed. The overall workflow didn't change, meaning that from the engineer's point of view, nothing changed. The process is just 10x faster now, and our Frontend engineers are able to change what is uploaded or not simply by changing JS code they are already familiar with.
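To illustrate what "changing what is uploaded by changing JS code" might look like, here is a minimal sketch. The file names, the skipped extensions, and the helper names are hypothetical; the real deployment script is not shown in this article.

```javascript
// Hypothetical sketch: choose which build artifacts get uploaded.
const path = require("node:path");

// Assumed rule set; these extensions are illustrative only.
const SKIPPED_EXTENSIONS = new Set([".map", ".txt"]);

// A file is uploaded unless its extension is in the skip list.
function shouldUpload(filePath) {
  return !SKIPPED_EXTENSIONS.has(path.extname(filePath));
}

// Keep only the files that pass the rule above.
function selectFilesToUpload(files) {
  return files.filter(shouldUpload);
}

const built = ["app.js", "app.js.map", "styles.css", "LICENSE.txt"];
console.log(selectFilesToUpload(built)); // [ 'app.js', 'styles.css' ]
```

Because the rule is a plain function in the same repository, a Frontend engineer could adjust it in a regular pull request instead of touching an external Bash or Go script.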
5. 📣 Any new or updated workflow should be communicated at large.
Too much is better than not enough.
When we finally have a go at a new or an old workflow, it is very important to communicate at every step of the process with everyone impacted by it.
We start with a presentation of the whole project before the first line of code is written. We explain why and how we do it, with an overall timeline (can be done through an RFC).
Then once we've started working on it, we show progress, give a clear timeline and any required actions at every step of the migration (this can be done by mail).
Finally, at completion, we explain the new/updated workflow again, and also reflect on what went well and what could have been done better (this can be done by mail and presentations).
This helps other engineers see our work, reinforcing the idea that we're not just a support team, but that we care about their happiness and act on improving it every day of the week.
I hope this clears things up for you and will help you manage the DX at your company.
Do you have best practices regarding the DX you'd like to share?
Psssst... 🤫 we're hiring.
Photo by Brooke Cagle on Unsplash
Thank you Erik for the thorough proofreading.