Alex Suraci

Why I joined Dagger

For the past 8 years I’ve been trying to solve CI/CD, “once and for all.”

  • once: The core design of the CI/CD system should approach “done.” Nothing is forever with software, but it should feel pretty darn close.
  • for all: Everyone should be able to use this CI/CD toolchain no matter the type of product they’re building or team they’re working with.

My first attempt was Concourse, a CI/CD system that scheduled pipelines written in declarative YAML. Choosing YAML for Concourse made it for all, but it was definitely not once; we had to constantly rework its declarative model to handle more use cases. As time went on I started to wonder if the final frontier was actually a “language for CI/CD.”

In 2021 I started on Bass, a scripting language for running commands and aggressively caching them. The theory of Bass is that 99% of CI/CD is just running commands, so having a better language for running commands should go a long way towards "solving CI/CD." It’s been a lot of fun to make. But even if Bass is able to solve CI/CD once, it definitely won’t be for all. Lots of folks won’t want to learn a whole new language just to get a CI/CD workflow going.

Last year I joined Dagger after realizing we were trying to solve all of the same problems (escaping YAML hell, unifying CI and dev workflows, minimizing CI overhead – more on all that later). We were even using the same underlying technology (Buildkit) and running into all of the same challenges.

The fit was kind of uncanny.

What’s more, Dagger appeared to be speed-running the experience I had with Concourse and Bass by pivoting the developer experience from declarative config to real code – but in any language.

Dagger’s new approach seemed like it could finally be once and for all.

Problems to solve

I’ve talked a lot about “solving CI/CD” but haven’t pointed out any of the problems worth solving. Let’s go over a few of them.

Going beyond local maxima

Bash, Make, YAML, and Dockerfiles arguably form the Mt. Rushmore of DevOps glue. Each of these tools is designed to do one thing and do it well, with a low barrier to entry.

“But…”

  • Bash scripts start as puppies but grow into haunted beasts. Greybeards recite incantations like set -e -u -x; set -o pipefail and foo "${bar}" to keep the ghosts at bay, but sometimes the ghosts win.

  • Makefiles are rooted in beautiful simplicity: “when file X is newer than file Y, build Y.” But they aren’t portable; the host machine needs to have the right version of every development dependency. They only attempt to solve part of the problem.

  • YAML is great for writing simple config files, but it is not great for writing large CI/CD pipelines and dynamic content. Templating YAML gives you more power, but the extra abstraction step also adds a layer of confusion: being told there’s a syntax error on line 1234 doesn’t help when your template doesn’t even have a line 1234.

  • Dockerfiles are super easy to write, but the friction of maintaining, building, pushing, and pulling them leads teams to favor monolithic CI images to minimize overhead. This pattern causes problems when teams share dependencies and one team bumps a dependency to a version that other teams don't yet support. Dockerfile syntax is also somewhat limited, with no loops or conditionals, and using templates has the same issues as with templating YAML. More on this in Replacing your Dockerfile with Go code.
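The “incantations” from the Bash bullet above are worth spelling out. A minimal defensive preamble (the exact flags are a matter of taste, and this is only a sketch of the idea) looks something like:

```shell
#!/usr/bin/env bash
# Exit on any error (-e), treat unset variables as errors (-u),
# and trace each command as it runs (-x).
set -eux
# Fail a pipeline if any command in it fails, not just the last one.
set -o pipefail

# Always quote expansions so values with spaces don't split into words.
greeting="hello world"
printf '%s\n' "${greeting}"
```

Without `pipefail`, `false | true` exits 0 and the ghost slips by; with it, the pipeline fails as you'd expect.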

No tool is perfect. Stretching any tool beyond its core competency generally results in a poor developer experience. I expect to continue to use each of these tools, but the hope is for Dagger to relieve some of the strain from the shoulders of these giants:

  • Dagger runs commands, like Bash and Make – but its commands are containerized, making them portable, and you can write Dagger code in a programming language of your choice.

  • Dagger is used to write CI/CD pipelines, as is common with YAML – but Dagger code is real code, not pseudocode. Instead of using templates you can refactor code and run it locally.

  • Dagger pipelines form a DAG, like Make – but its nodes are containers, not files. Each container brings its own dependencies, making the pipeline portable.

  • Dagger builds containers, like Dockerfiles – but it lets you treat them like code, rather than artifacts. You can refactor them however you want, and you don't need to push them to a registry.

Reproducible builds

Reproducible builds are the holy grail of CI/CD. They guarantee that every run will produce bit-for-bit identical outputs. Achieving this feat means accounting for every single input – including time itself! Another adjective for this type of build is hermetic.
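As a tiny taste of what “accounting for every single input” means in practice, here is a sketch using plain GNU tar (nothing Dagger-specific): pin the file ordering, timestamps, and ownership metadata, and two runs of the same “build” produce bit-for-bit identical archives.

```shell
#!/usr/bin/env bash
set -eu

# Work in a scratch directory with one known source file.
workdir=$(mktemp -d)
cd "$workdir"
mkdir src
printf 'hello\n' > src/app.txt

build() {
  # --sort=name fixes file ordering, --mtime pins timestamps
  # (accounting for time itself!), and the owner/group flags strip
  # uid/gid differences between machines. GNU tar flags assumed.
  tar --sort=name --mtime='@0' --owner=0 --group=0 --numeric-owner \
      -cf "$1" src
}

# Build the same artifact twice; the checksums match exactly.
build a.tar
build b.tar
sha256sum a.tar b.tar
```

Drop any one of those flags and the checksums drift apart between machines or runs, which is exactly why fully hermetic builds take real discipline.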

Hermetic builds unlock a lot of cool ideas. They allow for aggressive caching and deduping concurrent runs, dramatically improving efficiency. They allow for artifact recipes to be published alongside the artifact itself, acting as a sort of “proof” of its checksum. Most importantly, hermetic builds prevent surprises in the path to production. Hermetic builds are also great for keeping teams on the same page: you know that everyone is seeing the same thing and using exactly the same CI pipeline.

As great as they are, hermetic builds come with a cost. Chasing down checksums and accounting for every dependency (no more apt-get) is a lot of work for the first day of an experimental project.

There are already really solid tools for reproducible builds (Nix and Bazel come to mind) but they tend to be a bit opinionated. Hermetic builds are absolutely the best practice to follow, but making them a requirement tends to alienate users.

Dagger lets you reap the benefits of hermetic builds, but it doesn’t force you to commit. Everything is cached, which encourages you to explicitly represent any inputs that should bust the cache, but you can also just throw in a timestamp. The mechanics are there; how you leverage them is up to you.
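A rough sketch of those mechanics, in shell rather than Dagger’s actual implementation: a step’s cache key is a hash of its declared inputs (including the command itself), so adding a timestamp as an input is all it takes to opt out of caching.

```shell
#!/usr/bin/env bash
set -eu

CACHE_DIR=$(mktemp -d)

# Run a command only if the same declared inputs haven't run before.
# The cache key is a hash of every input, including the command line.
cached_run() {
  key=$(printf '%s\n' "$@" | sha256sum | cut -d' ' -f1)
  hit="$CACHE_DIR/$key"
  if [ -f "$hit" ]; then
    cat "$hit"            # cache hit: replay the recorded output
  else
    "$@" | tee "$hit"     # cache miss: run the command and record it
  fi
}

cached_run echo "build step"   # runs
cached_run echo "build step"   # cached: identical inputs, identical key
# A timestamp input busts the cache, forcing a fresh run:
cached_run env STAMP="$(date +%s%N)" echo "build step"
```

Three invocations, but only two cache entries: the second call deduplicates against the first, while the timestamp makes the third a deliberate cache miss.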

Decentralizing CI/CD

Ok, that’s clickbait. Part of the appeal of CI/CD is to centralize. Having a source of truth allows the team to agree on the state of the path to production.

One risk with centralization is ending up with a non-reproducible CI system. Hand-tweaked runtime environments, hand-configured pipelines and boxes checked by former employees. Tribal knowledge. Outdated documentation, if there’s any at all.

So: it’s OK to centralize, but there should be nothing special about the central instance. Any developer should be able to understand and run the project’s CI on their local machine, and easily bootstrap a new central instance in the event of disaster.

I also shouldn’t need to do this routine:

```shell
$ vim ci.yml
$ git commit -am "please work"
$ git push
```

Engineers test their product changes before committing so they don’t pollute their repository with broken code. Why should CI code be any different?

Dagger pipelines are portable; they can run anywhere that has a Dagger engine, whether that’s your local dev machine, a dedicated server, or a hosted CI platform.

Scaling with complexity

This point is a bit nebulous compared to the others, but it feels worth including nonetheless.

Automation needs tend to grow over time. Projects increase scope, target more platforms, and learn from more mistakes.

A CI/CD stack should grow gracefully to meet the demands of its project. New methods of testing and shipping should be easy to add. Refactoring pipelines and enforcing sweeping policy changes should feel as frictionless as possible.

Every developer should feel capable of fully understanding, maintaining, and optimizing their CI/CD stack. New developers should be able to join a mature project and easily figure out how everything works.

Dagger has the developer do what they do best: read and write code. Dagger also tries to leverage the same handful of concepts as much as possible so there’s not too much to keep in your head. All of this is so that developers can focus on their product more than learning to work with its CI pipeline.

The frontier ahead

Alright, so… that’s the theory. Some of these points are foundational, and some of them are aspirational. It’s up to us to put it all into practice, and we’re dependent on your constant feedback to make sure we’re building the right thing.

I’d love to know whether these points resonate with you, or if I’m completely missing the mark. Let me know in the comments below, or in Discord! You can find me in the various Dagger channels as vito#9876.

Top comments (1)

Ali Khajeh-Hosseini

@vito the points you raise resonate a lot with me! My CI/CD maintenance experience started with Jenkins ~10 years ago and migrating users to TravisCI felt good as that travis.yml just worked and was readable.

In the last two years, I've been working on github.com/infracost/infracost and developing CI/CD integrations for that tool started with a simple bash script that grew arms and legs. The journey then moved into creating CI/CD integrations for a range of popular CI/CD systems. The next phase that I'm seeing with more mature tools is direct source control integration, using their app plugins (e.g. GitHub App integration). It's interesting as vendors literally write CI/CD code/logic to handle GitHub App events etc.

So that leaves me wondering why these dev tools go through this transition, and if having dagger could mean that vendors don't need to go towards building source control integrations?

Happy to chat over a Zoom call if you prefer to dive deeper! My email is ali.hosseini@infracost.io