Domain Driven Design (DDD) is a useful tool in my toolbox. I try to avoid the dogmatic "by the book" approach that is prevalent in some engineering communities and use just enough of the good parts to get my message across.
If you are familiar with the GoF design patterns book, you may remember that it became something of a troublesome text, because people would sometimes try to use as many of its patterns as possible on the grounds that they were "best practices". That's roughly how I feel about DDD.
It is something we can adopt in measured amounts to help us communicate better with our peers. If that's how we want to use the best bits of DDD, we need easier-to-digest texts in bite-size format; this is my attempt at producing one.
(I work for a logistics company, and I selected that domain for the examples, but nothing written here reflects the internals of our systems to protect both the guilty and the innocent. 🤞)
Let's look at the status quo for a second. We exist in a world of homogeneous software, so much so that many things I might mention here will seem strange... of course we have IDs in URLs, of course we have REST API handlers for all our Rails models, of CoUrSe we share a database with some downstream system, etc.
I should preface this text by saying that if you're not feeling pain by doing whatever you are doing, then there's no reason to stop; try to enjoy this piece of writing and maybe let it imbue you with new mental tools.
I'm speaking from a Ruby on Rails perspective, because the project that inspired me to write this is a Rails app, but I hope you'll read through the examples here and make links to software you have worked on, or with, on other tech stacks.
A software platform exists for moving Widgets from A-C; maybe a driver in a car brings them from A-B, and another driver brings them from B-C. This is modelled in the system as:
    class Package < ActiveRecord::Base
      # the usual suspects of AR attributes
      has_many :deliveries
    end

    class Delivery < ActiveRecord::Base
      # the usual suspects of AR attributes
      belongs_to :package
      has_many :tasks
    end

    class Task < ActiveRecord::Base
      # the usual suspects of AR attributes
      belongs_to :delivery
    end
To give a concrete example then, moving a Package across town will involve at least one Package and two or more Deliveries (travel to the pick-up, travel to the drop-off), and each Delivery will have one or more Tasks (e.g. scan a barcode, check someone's ID, sign for a package, get a signature for a package, etc).
Clients care about Packages getting from A-C, but a Driver cares about the legs of the journey. Clients want to have a Package#id to track their request by, and drivers want to see their Delivery options, whether they're doing A-B, A-B-C, or just B-C, and how that fits in with other options on the platform after they finish the job.
In other words, Package and Delivery are the "most important" concepts in our system.
Importantly, that system extends past the software: it includes the language of our drivers and clients, the language of our customer support people, the language of our lawyers, HR department, and more.
That is a great introduction to two core DDD principles: bounded context and ubiquitous language (hereafter BC and UL). They really go hand in hand.
Within the context of our company, there is no ambiguity about the language we use for those key things.
I dropped a sneaky 💣 above... did you catch it?
I said "...finish the job". "Jobs" are not part of our model, and if we catch the word "job" slipping into our shared vocabulary, it might be a hint that our mental model is drifting far away from our actual model. We should try hard to keep the implementation and the way people speak about the platform in sync.
Let's drill deeper on BC and UL for a moment.
Client is a word we have a real problem with in most software. The OAuth 2.0 spec names "client" as a first-class term, fine. The problem comes in when one starts to have conversations like "ohh, sure, we can add another client to handle that... yeah, no not a client client, a client, like an oauth app client".
To a certain degree it can be a sign of a weak design. In my company, most Clients are strictly either medium-volume users of a web platform that we provide, or users of a set of APIs for moving higher volumes. We call the latter "Integrators" and the former "Dashboard Clients". These might have been better names than one kind of polymorphic Client.
The idea of a single kind of "Client" is a bit of a lie we tell ourselves. Both move packages around in cities, and for sure they have a lot in common superficially, but they use our platform really differently, and had we known that up front, we might have made different naming choices.
Maybe five years ago someone ran $ rails generate model Client... and here we are. Entire generations of staff have come and gone using that vernacular, and it's stuck now; even if better choices might have been an option, adopting them will take a "generation".
Whether Client(logistics)⊜Client(authentication) or something else, this namespace conflict is unavoidable after a certain level of complexity, and it can be partially addressed by admitting that authentication is an entirely different BC.
In other words, it's OK for client to mean one thing in the authentication system, and something entirely different in the logistics system.
This can be a great way to find boundaries to divide services.
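A minimal sketch of that idea in plain Ruby, assuming we express each BC as a module. The module names and the attributes on each struct are made up for illustration; nothing here reflects a real system.

```ruby
# Hypothetical module and attribute names, for illustration only.
module Authentication
  # An OAuth-style client: an application that is allowed to request tokens.
  Client = Struct.new(:client_id, :redirect_uri, keyword_init: true)
end

module Logistics
  # A logistics client: a customer who wants packages moved.
  Client = Struct.new(:name, :billing_email, keyword_init: true)
end

oauth_app = Authentication::Client.new(client_id: "abc123", redirect_uri: "https://example.test/callback")
customer  = Logistics::Client.new(name: "Acme Widgets", billing_email: "billing@example.test")

puts oauth_app.class # Authentication::Client
puts customer.class  # Logistics::Client
```

Both are called Client, but the namespace tells you which conversation you are in, and neither needs to know the other's attributes.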
Seeking a UL for your company can be an absolutely exhausting process, and it might seem like utter pedantry, but it's rather important.
It's also not something you can really do up front; the remedy is constant vigilance against the meaning of words being diluted.
A bit of ambiguity might be pretty harmless when you are a small team, familiar with the code, but every ounce of "wtf" that is allowed to creep into an API is inflicting a measurable mental overhead on every person who has to interact with your system.
Now, having introduced a logistics BC and an authentication BC, another tool from DDD becomes useful: Context Mapping.
Context mapping can be as simple or as complex as you like, but for us, in our synthetic example, it might mean that we can discuss the implications of agreeing to a statement like this:
All permissions live in the Authentication system (it additionally becomes the authorization system); the logistics BC is a consumer of the tokens/metadata produced by the authz/n system.
This is a map, albeit a simple one, but it serves to help us model that relationship clearly. If we know that the single source of truth about client/driver access control is the authentication system, not the logistics system, then we may make wiser decisions.
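To make that relationship a bit more tangible, here is a rough sketch in plain Ruby. Everything in it is an assumption for the sake of the example: the Token struct, the "packages:cancel" permission string and the method names are invented, and a real system would verify a signed token rather than trust a struct.

```ruby
# Hypothetical names throughout; a sketch of "logistics consumes what authz/n produces".
module Authentication
  # The token is the only thing that crosses the boundary between the two BCs.
  Token = Struct.new(:subject, :permissions, keyword_init: true)

  def self.issue_token(subject:, permissions:)
    Token.new(subject: subject, permissions: permissions.dup.freeze).freeze
  end
end

module Logistics
  class NotAuthorized < StandardError; end

  # Logistics never decides who may do what; it only reads the token it was given.
  def self.cancel_package(package_id, token:)
    unless token.permissions.include?("packages:cancel")
      raise NotAuthorized, "#{token.subject} may not cancel packages"
    end

    "package #{package_id} cancelled by #{token.subject}"
  end
end

token = Authentication.issue_token(subject: "client-42", permissions: ["packages:cancel"])
puts Logistics.cancel_package(7, token: token) # package 7 cancelled by client-42
```

The point of the sketch is the direction of the dependency: Logistics knows what a token looks like, Authentication knows nothing about packages.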
Context Mapping comes more into play when you start to introduce more than a couple of BCs. Geocoding is probably a distinct BC: addresses in that system have a subtly different meaning to the logistics system. Presumably the geocoder deals with mixed-quality, mixed-accuracy addresses, and the logistics system deals with positions on the map accurate to a meter or two. They're both "Addresses", no doubt, but they are definitely different concepts.
The same is true, in my experience, of invoicing, payments, pricing, and the other "fabric" of real-world interfaces. At the very least, a Fee and an Earning are the same concept seen from two different perspectives, depending on whether you are the Client paying or the Driver being paid. If your logistics platform is some kind of broker, those terms mean opposite things twice, as you earn and are paid, and are charged and must pay, on both sides of the Client<=>Driver relationship.
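A small sketch of the broker case, assuming (and this is purely my invention for the example) that the client-facing side lives in a ClientBilling context and the driver-facing side in a DriverPayouts context. The amounts are arbitrary and stored as integer cents.

```ruby
# Hypothetical names and numbers; amounts are integer cents to avoid float rounding.
module ClientBilling
  # What the Client pays for a Package.
  Fee = Struct.new(:package_id, :amount_cents, keyword_init: true)
end

module DriverPayouts
  # What a Driver is paid for a Delivery.
  Earning = Struct.new(:delivery_id, :amount_cents, keyword_init: true)
end

# The broker sits in the middle: it is paid the Fee and pays out the Earnings.
def broker_margin_cents(fee, earnings)
  fee.amount_cents - earnings.sum(&:amount_cents)
end

fee = ClientBilling::Fee.new(package_id: 1, amount_cents: 1500)
earnings = [
  DriverPayouts::Earning.new(delivery_id: 10, amount_cents: 600),
  DriverPayouts::Earning.new(delivery_id: 11, amount_cents: 500)
]
puts broker_margin_cents(fee, earnings) # 400
```

The same money shows up under two names, and which name is right depends entirely on which side of the relationship you are standing on.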
Either way, DDD and having a clear UL can help you at least have the conversation in a structured way.
If the understanding within your engineering organization is pervasive enough, you might avoid having to even discuss whether a feature belongs in the logistics or authz/n context; it's clear enough already that, given the freedom to work, the teams will do the right thing.
- Words mean things; that's important. If the meaning (ubiquitous language) is difficult to achieve, read the room, and look for an opportunity to achieve nirvana by dividing into another bounded context.
- It's OK for different things to have the same name, as long as they're not in the same BC.
And, back to our Tasks for the last three pillars of DDD...
Useful definitions of these three terms are surprisingly difficult to come by, but I think this is a pretty helpful one:
- Aggregates are defined by their identity.
- Value objects are defined by their attributes.
We didn't mention it, but of course when we move packages around a city, A and B are real places. If A and B were each a pair of [lat, long] coordinates, would you think about using a database table with [ID, lat, long] columns?
I hope/think not; attaching an ID to a lat/long coordinate pair is (almost) entirely redundant. The position is uniquely identified by the lat/long (continental drift notwithstanding).
Coordinate pairs are probably Value Objects (hereafter VO). It helps that they're small enough that storing them in-line on a Package, rather than normalizing them out into a separate table in the database, is viable.
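Here is roughly what a coordinate pair looks like as a value object in plain Ruby. The class name and the choice to coerce inputs with Float() are mine, not something from a real codebase.

```ruby
# A minimal value object; class and method names are illustrative.
class Coordinates
  attr_reader :lat, :long

  def initialize(lat, long)
    @lat = Float(lat)
    @long = Float(long)
    freeze # value objects do not change after creation
  end

  # Equality is by attributes, not by object identity or a database ID.
  def ==(other)
    other.is_a?(Coordinates) && lat == other.lat && long == other.long
  end
  alias eql? ==

  # Equal values must hash equally so they behave as the same Hash key.
  def hash
    [lat, long].hash
  end
end

a = Coordinates.new(52.52, 13.405)
b = Coordinates.new(52.52, 13.405)
puts a == b      # true
puts a.equal?(b) # false: two objects, one value
```

There is no ID anywhere in that class, and nothing is lost by leaving it out.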
It becomes less obvious if you want to allow a Client to change a pick-up location after the fact, or if you run a geocoding algorithm and want to correlate and analyse input address texts with the resulting coordinates. Maybe then there's a reason to treat them as an aggregate (maybe you do a fast geocoding with a bad algorithm to get a 1km radius address in the first instance, and you follow up by overwriting those coordinates with a better algorithm a few minutes later).
VOs are easy, low-hanging fruit. Consider them the immutable parts of a system that you never need to address directly from the outside; they exist to populate properties on higher-level constructs such as a Package.
An aggregate, then, is defined by its identity; that means we can change the attributes without changing the identity. We can change the pick-up location, or the assigned driver, or the weight, without changing the Package concept: this is the same courier envelope moving across town, but it has new attributes.
This isn't strictly about mutability. The same A-B delivery by the same driver on the last Friday of every month is 12 distinct packages every year, even if the attributes are the same.
This means either that the attributes themselves aren't enough to uniquely identify the aggregate, or that we have some other motive for assigning an identity; maybe this thing is important enough that we (or our users) want to be able to ask for it by "name" in the future.
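The flip side of a value object, sketched in plain Ruby rather than ActiveRecord. The attribute names and the use of a UUID as the identity are assumptions for the example.

```ruby
require "securerandom"

# Illustrative only: a plain-Ruby stand-in for the Package model.
class Package
  attr_reader :id
  attr_accessor :pickup, :dropoff, :weight_kg

  def initialize(pickup:, dropoff:, weight_kg:, id: SecureRandom.uuid)
    @id = id
    @pickup = pickup
    @dropoff = dropoff
    @weight_kg = weight_kg
  end

  # Equality is by identity: the same id means the same package,
  # whatever the attributes currently say.
  def ==(other)
    other.is_a?(Package) && id == other.id
  end
end

january  = Package.new(pickup: "A", dropoff: "B", weight_kg: 1.0)
february = Package.new(pickup: "A", dropoff: "B", weight_kg: 1.0)
puts january == february # false: same attributes, different packages

# A second copy of January's package (say, loaded again later) with a new weight.
reweighed = Package.new(id: january.id, pickup: "A", dropoff: "B", weight_kg: 2.5)
puts january == reweighed # true: different attributes, same package
```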
Earlier we identified three kinds of logistics-related things in our data model: Package, Delivery and Task.
The first two are definitely aggregates: Clients move Packages around, and Drivers are assigned to Deliveries; when all the Deliveries for a Package are done, the Client's request has been fulfilled.
Tasks are more nuanced. They are immutable, but they are not VOs ("get a receipt" as a task might exist 5,000 times per day on our platform), and they have unique IDs because of the way we store them in our DB... so what are they, then?
We can begin to answer that by introducing the concept of an Aggregate Root.
For the Client-facing parts of our system, the entrypoint to all of our capabilities is Package; it is the root of those Client-visible aggregates and VOs.
A Package may have Deliveries, a Delivery is an aggregate in its own right, and Task is in the tree too; that is all we can say for sure, even if we're not sure what a Task is yet.
This gives us an immensely powerful tool. If we build a feature to be consumed by Clients, the entrypoint must be a Package, and we have the freedom to mask the implementation details.
Maybe we don't even tell clients that packages may have different deliveries assigned to different drivers, and we simply report a % completion on a Package. It's a direct mapping to how many of the Tasks are completed, and the client need never know the implementation details.
Maybe requesting a Package from the API returns a specifically crafted high-density JSON document tailored to client needs, omitting all mention of Deliveries and Tasks.
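A sketch of what that could look like, using plain Ruby structs in place of the ActiveRecord models. The method names percent_complete and to_client_json, and the attributes on Delivery and Task, are hypothetical.

```ruby
require "json"

# Plain-Ruby stand-ins for the models; attribute names are assumptions.
Task = Struct.new(:name, :completed, keyword_init: true)
Delivery = Struct.new(:from, :to, :tasks, keyword_init: true)

class Package
  def initialize(id:, deliveries:)
    @id = id
    @deliveries = deliveries
  end

  # Share of completed tasks across every delivery, as a whole percentage.
  def percent_complete
    tasks = @deliveries.flat_map(&:tasks)
    return 0 if tasks.empty?

    (100.0 * tasks.count(&:completed) / tasks.size).round
  end

  # What a client sees: no deliveries, no tasks, no drivers.
  def to_client_json
    JSON.generate({ id: @id, percent_complete: percent_complete })
  end
end

package = Package.new(
  id: 123,
  deliveries: [
    Delivery.new(from: "A", to: "B", tasks: [Task.new(name: "scan barcode", completed: true),
                                             Task.new(name: "get signature", completed: true)]),
    Delivery.new(from: "B", to: "C", tasks: [Task.new(name: "scan barcode", completed: true),
                                             Task.new(name: "check ID", completed: false)])
  ]
)
puts package.to_client_json # {"id":123,"percent_complete":75}
```

Deliveries and Tasks still exist, but only the Package gets to decide how much of them leaks out.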
Deliveries are a fun example too. Drivers can enjoy being Jason Statham in the Transporter movies, asking no questions about what's in the box; to them, their world revolves around moving things from X-Y. They don't need to know about a global end-to-end guaranteed-in-thirty-minutes-or-less logistics platform; they can be the ultra gig economy cogs moving things on the next step of the journey, never worrying about the bigger picture.
Package concerns, such as they are (the status of the request as a whole), simply don't enter into the Drivers' mental model of our world of logistics.
A Client doesn't care about Tasks; to a Driver, a Task is a check-list item on a Delivery. The fact that we stored them in a separate table in our database is utterly irrelevant; to the outside world, a Delivery simply has one or more things to swipe-right on when using our app.
Probably the confusion about what a Task is arises from our framework making it trivial to name and give identity to these kinds of things, which really are just complex, somewhat stateful attributes on aggregates (or aggregate roots) in our system.
Embracing this discovery, and agreeing within the team that a Task is a kind of hybrid positional value object, identified in the context of its parent Delivery by being in position 1 of 2, helps us immensely.
We know a Task should never be a first-class concept in the app. If we can stick to that definition, maybe one day we can drop the Tasks table altogether and find a way to serialize all of that into a Delivery, save ourselves some N+1 database look-ups, and shrink our data (and mental) models.
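To sketch where that could end up: a Delivery that owns an ordered list of Task values and can serialize them into a single document. This is a thought experiment in plain Ruby, not a migration plan; the kind and done attributes and both method names are made up.

```ruby
require "json"

# A sketch only: Delivery is a plain Ruby class here and the
# `kind`/`done` attributes are invented for the example.
Task = Struct.new(:kind, :done, keyword_init: true)

class Delivery
  attr_reader :id, :tasks

  def initialize(id:, tasks:)
    @id = id
    @tasks = tasks.freeze # order matters: a task is identified by its position
  end

  # Completing "task 1 of 2" swaps in a new value at that position
  # and returns a new Delivery; no Task id is involved.
  def complete_task(position)
    updated = @tasks.each_with_index.map do |task, index|
      index == position - 1 ? Task.new(kind: task.kind, done: true) : task
    end
    Delivery.new(id: @id, tasks: updated)
  end

  # Everything needed to store the tasks in one column on the delivery row.
  def serialized_tasks
    JSON.generate(@tasks.map(&:to_h))
  end
end

delivery = Delivery.new(id: 1, tasks: [Task.new(kind: "scan_barcode", done: false),
                                       Task.new(kind: "get_signature", done: false)])
delivery = delivery.complete_task(1)
puts delivery.serialized_tasks
# [{"kind":"scan_barcode","done":true},{"kind":"get_signature","done":false}]
```

Nothing outside the Delivery can ask for a Task by ID here, which is exactly the constraint we agreed on.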
There was a great deal of "what-if" in the previous section. This was supposed to be an educational post, but I fear I asked more questions than I answered.
If you take only one thing away from having read this, let it be that these are useful things to discuss when coding. Try, proactively, to avoid falling into the trap of doing whatever your framework offers because it is easy, and use these parts of DDD as a weapon to understand and confront complexity.
Adopt the principles of Ubiquitous Language(s) within Bounded Contexts, and form a habit of discussing the Context Map as a way to identify where features or behaviour belong. When sketching out the contracts or APIs between your BCs and customers, make deliberate decisions about whether your things are Aggregate Roots, Aggregates or simple Value Objects.
I assume if you made it to this footer, you at least had a passing interest in learning about DDD, thanks for sticking with me.
Aside from the benefits of having a cleaner interface, there are some really tangible benefits to structuring your internal code around these principles too.
What happens if a package (A-C) has been picked up, and the driver is part way through the A-B journey and runs into a problem? Their car breaks, bike gets a flat, whatever.
If you model the remedial action as a fix for the Delivery (you introduce an A.5->B delivery after A-B but before B-C) directly by interfacing with the Delivery model in the database, the Package has no way to know that something "inside it" changed.
With ActiveRecord you get some callbacks for propagating changes on belongs_to to the owning object, for cache busting or keeping things in sync, but using these things is an indictment of your approach. The correct fix is/was probably to add something like Package.find(123).recover_problematic_current_delivery!, which lets the Package (or some service object, I hope) recover by manipulating its deliveries in a way that it allows.
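A plain-Ruby sketch of routing that fix through the root. The method name mirrors the one above, but the breakdown_location argument, the status values and the version counter are all assumptions I added to make the example self-contained.

```ruby
# Plain-Ruby stand-ins; attributes and statuses are assumed for illustration.
Delivery = Struct.new(:from, :to, :status, keyword_init: true)

class Package
  attr_reader :version

  def initialize(deliveries)
    @deliveries = deliveries
    @version = 0
  end

  # Read-only view: callers cannot splice deliveries in behind the root's back.
  def deliveries
    @deliveries.dup.freeze
  end

  # Mark the in-progress leg as failed and add a new leg from the breakdown point.
  def recover_problematic_current_delivery!(breakdown_location)
    index = @deliveries.index { |d| d.status == :in_progress }
    raise ArgumentError, "no delivery in progress" if index.nil?

    current = @deliveries[index]
    current.status = :failed
    replacement = Delivery.new(from: breakdown_location, to: current.to, status: :pending)
    @deliveries.insert(index + 1, replacement)
    @version += 1 # the package knows that something inside it changed
    self
  end
end

package = Package.new([
  Delivery.new(from: "A", to: "B", status: :in_progress),
  Delivery.new(from: "B", to: "C", status: :pending)
])
package.recover_problematic_current_delivery!("A.5")
puts package.deliveries.map { |d| "#{d.from}->#{d.to} (#{d.status})" }
# A->B (failed)
# A.5->B (pending)
# B->C (pending)
```

Because the change goes through the Package, the Package gets the chance to bump a version, bust a cache, or refuse the change outright.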
In just about every programming language you have a concept of public/private code, exported/unexported models, but nearly all frameworks expose all your database tables as "global" objects, which I think is a huge mistake.
P.S. Thanks for making it to the footer. I'm trying to work on my writing about DDD concepts, which are extremely subjective, and this is the first of what I hope will become many posts of increasing quality.