Orchestrate the Core, Choreograph the Edges: How I Actually Choose Between the Two

#softwaredevelopment #designpatterns #eventdriven #microservices

An orchestra needs a conductor; a dance troupe doesn't. Most distributed workflows need both — and the skill is knowing which part is which.

There is an argument I have watched play out in design reviews for fifteen years, and it almost always generates more heat than light.

One engineer sketches a central service that calls the others in sequence — submit, validate, charge, notify — and owns the whole flow. A second engineer recoils: "That's a god-service. Make the services emit events and react to each other. Decouple everything." The first counters that nobody will be able to understand the system, that there's no single place to see what happened. Voices rise. Eventually the more senior or more stubborn person wins, and the decision — orchestration or choreography — gets made on temperament rather than on the shape of the problem.

That's the part worth fixing. Orchestration versus choreography is not a matter of taste, and it is not a binary you decide once for the whole system. It is a per-workflow judgment with a small number of inputs, and once you can name those inputs the argument mostly evaporates. I want to give you the framework I actually use, grounded in a pricing platform I built that runs both styles on purpose, in the same codebase, because each was right for a different part of the same business process.

First, let's make sure we're steelmanning both — because the design-review fight usually involves two people describing caricatures.

The two words, defined honestly

The metaphor in the names is exact, and it's worth taking literally.

Orchestration is an orchestra with a conductor. One component — a workflow engine, a state machine, a coordinator — holds the score and tells each service when to play. The control flow lives in one place. When you ask "what's the status of this transaction and what happens next," there is a single authority that knows.

Choreography is a dance troupe with no conductor on stage. Each dancer knows the routine and responds to cues — the music, the other dancers' movements — and the coordination emerges from rules each performer follows independently. Translated to systems: services publish events, other services subscribe and react, and no one is centrally in charge. The control flow is distributed across everyone's reaction to everyone else.

Neither is more "advanced" than the other. They optimize for opposite things, and that opposition is the whole decision.

The case for a conductor

Orchestration buys you one priceless thing: a single place to reason about the flow. When a process is a genuine sequence with branches — evaluate the rules; if it's clean, publish; if it's a soft violation, route to a human; if it's a hard violation, reject — that logic has to live somewhere. Put it in one state machine and you get an authoritative answer to "where is this and why," you get one place to handle timeouts and retries, and — critically — you get a natural home for compensation: when step four fails, the coordinator knows how to walk steps three, two, and one backward in order.
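The ordered-compensation point can be sketched in a few lines of plain Java. This is a minimal illustration, not how any particular workflow engine does it: SagaStep and SagaCoordinator are hypothetical names, and each step is assumed to supply its own undo action.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical step type: a forward action paired with the action that undoes it.
record SagaStep(String name, Runnable action, Runnable compensation) {}

final class SagaCoordinator {
    /** Runs steps in order. On the first failure, undoes completed steps in reverse order. */
    static boolean run(List<SagaStep> steps) {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            try {
                step.action().run();
                completed.push(step); // most recent step sits on top of the stack
            } catch (RuntimeException failure) {
                // The failed step never completed, so only earlier steps are compensated.
                while (!completed.isEmpty()) {
                    completed.pop().compensation().run();
                }
                return false;
            }
        }
        return true;
    }
}
```

Because completed steps sit on a stack, a failure in step four pops and compensates three, then two, then one. The sketch assumes compensations themselves do not fail; a real coordinator would also need retries and timeouts around them.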

This is exactly why, in the pricing platform, the price-approval lifecycle is orchestrated. A proposed price change runs through a rules engine — a hard floor on margin, soft limits on the size of an increase and on pricing materially above a competitor — and a soft violation parks the proposal in a PENDING_APPROVAL state until a human acts. That is complex, stateful, human-involved decisioning. It wants one owner. Trying to express "wait, possibly for hours, for a human to approve, then continue" as a web of independent event reactions is how you build a system nobody can debug at 2 p.m., let alone 2 a.m. A workflow engine — Camunda, Temporal, or a hand-rolled state machine — is the right tool, and reaching for it here is maturity, not bureaucracy.
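A rough sketch of that lifecycle as a hand-rolled state machine follows. Only PENDING_APPROVAL comes from the platform as described here; the other state names, the class names, and all three thresholds (10% margin floor, 15% maximum increase, 5% above competitor) are assumptions chosen for illustration.

```java
// All names and thresholds below are hypothetical, chosen only for illustration.
enum Verdict { CLEAN, SOFT_VIOLATION, HARD_VIOLATION }

enum ProposalState { PROPOSED, PENDING_APPROVAL, PUBLISHED, REJECTED }

record PriceProposal(String sku, double cost, double currentPrice,
                     double proposedPrice, double competitorPrice) {}

final class PricingRules {
    static final double MIN_MARGIN = 0.10;           // hard floor (assumed value)
    static final double MAX_INCREASE = 0.15;         // soft limit (assumed value)
    static final double MAX_ABOVE_COMPETITOR = 0.05; // soft limit (assumed value)

    static Verdict evaluate(PriceProposal p) {
        double margin = (p.proposedPrice() - p.cost()) / p.proposedPrice();
        if (margin < MIN_MARGIN) return Verdict.HARD_VIOLATION;

        double increase = (p.proposedPrice() - p.currentPrice()) / p.currentPrice();
        double aboveCompetitor = (p.proposedPrice() - p.competitorPrice()) / p.competitorPrice();
        if (increase > MAX_INCREASE || aboveCompetitor > MAX_ABOVE_COMPETITOR) {
            return Verdict.SOFT_VIOLATION;
        }
        return Verdict.CLEAN;
    }
}

final class ApprovalWorkflow {
    private ProposalState state = ProposalState.PROPOSED;

    ProposalState state() { return state; }

    /** Evaluates the rules and routes the proposal: publish, park for a human, or reject. */
    ProposalState submit(PriceProposal proposal) {
        requireState(ProposalState.PROPOSED);
        state = switch (PricingRules.evaluate(proposal)) {
            case CLEAN -> ProposalState.PUBLISHED;
            case SOFT_VIOLATION -> ProposalState.PENDING_APPROVAL;
            case HARD_VIOLATION -> ProposalState.REJECTED;
        };
        return state;
    }

    /** Records the human decision; only legal while the proposal is parked. */
    ProposalState decide(boolean approved) {
        requireState(ProposalState.PENDING_APPROVAL);
        state = approved ? ProposalState.PUBLISHED : ProposalState.REJECTED;
        return state;
    }

    private void requireState(ProposalState expected) {
        if (state != expected) {
            throw new IllegalStateException("Expected " + expected + " but was " + state);
        }
    }
}
```

The whole flow is readable in two methods, and an illegal transition, such as approving a proposal that was never parked, fails loudly instead of silently. A production version would persist the state so the wait for a human survives restarts, which is much of what an engine like Camunda or Temporal provides.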

The risk you accept, and must manage, is that the orchestrator wants to grow. Every new requirement is tempted to land in the coordinator until it becomes the god-service the second engineer feared. Orchestration is safe only when you keep the orchestrator's job narrow: it owns sequencing and compensation, not business logic that belongs inside the services.

The case for no one being in charge

Choreography buys you the opposite priceless thing: autonomy that scales across an organization. Once a price change is published, several different parts of the business need to react — the catalog updates its read model, the search index re-indexes — and here a coordinator is actively the wrong answer. Those are different teams, different bounded contexts, different deployment cadences. If a central orchestrator had to know about every downstream consumer, every new consumer would require changing the orchestrator, and you'd have re-coupled the very teams you were trying to set free.

So in the pricing platform the post-publish propagation is choreographed. Publishing emits a PriceChangePublished event; the catalog and search contexts subscribe and react on their own schedules, knowing nothing about each other. A new consumer — analytics, a recommendation engine, a partner feed — just subscribes. Nobody changes the publisher. This is how you let an organization grow without every team's roadmap becoming a dependency on one central team's backlog. It is Conway's Law used for you instead of against you.
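In code, the shape is roughly the following. This sketch uses an in-memory topic as a stand-in for a real broker, and it delivers synchronously, whereas real consumers would react asynchronously on their own schedules. Apart from the PriceChangePublished event name, every class and field here is a hypothetical illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Event fields are assumed for the sketch.
record PriceChangePublished(String proposalId, String sku, double newPrice) {}

// In-memory stand-in for a broker topic. The publisher only knows this topic,
// never the individual consumers.
final class PriceEventTopic {
    private final List<Consumer<PriceChangePublished>> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<PriceChangePublished> handler) {
        subscribers.add(handler);
    }

    void publish(PriceChangePublished event) {
        for (Consumer<PriceChangePublished> subscriber : subscribers) {
            subscriber.accept(event);
        }
    }
}

// Catalog context: keeps its own read model up to date.
final class CatalogReadModel {
    final Map<String, Double> prices = new HashMap<>();

    void on(PriceChangePublished event) {
        prices.put(event.sku(), event.newPrice());
    }
}

// Search context: records which SKUs need re-indexing.
final class SearchIndexer {
    final List<String> reindexed = new ArrayList<>();

    void on(PriceChangePublished event) {
        reindexed.add(event.sku());
    }
}
```

Wiring is one line per consumer: topic.subscribe(catalog::on) and topic.subscribe(search::on). Adding an analytics consumer later is one more subscribe call, and neither the publisher nor the existing consumers change.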

The price you pay is visibility. With no conductor, there is no single place that knows the end-to-end status, and a failure in one reactor doesn't naturally roll back the others. So choreography is only honest when you pay for two things explicitly. The first is observability: you must be able to trace an event's fan-out across contexts after the fact, because you can't see it in one place live. The second is a compensation path, since there's no coordinator to undo things for you. In the pricing platform, if a catalog update fails after publish, the catalog context emits a PriceChangeRollbackRequested event, and the proposal is walked back to ROLLED_BACK through event reactions alone. The undo is itself a reaction, not a command from above.
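A minimal sketch of that compensation path, under stated assumptions: the event names and the ROLLED_BACK state come from the platform as described, while EventBus, CatalogContext, ProposalContext, the event fields, and the simulated write failure are all hypothetical, and delivery is synchronous and in-memory for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

record PriceChangePublished(String proposalId, String sku, double newPrice) {}
record PriceChangeRollbackRequested(String proposalId, String reason) {}

// Tiny type-keyed bus standing in for a real broker (an assumption of this sketch).
final class EventBus {
    private final Map<Class<?>, List<Consumer<Object>>> handlers = new HashMap<>();

    <T> void subscribe(Class<T> type, Consumer<T> handler) {
        handlers.computeIfAbsent(type, k -> new ArrayList<>())
                .add(event -> handler.accept(type.cast(event)));
    }

    void publish(Object event) {
        for (Consumer<Object> handler : handlers.getOrDefault(event.getClass(), List.of())) {
            handler.accept(event);
        }
    }
}

final class CatalogContext {
    private final EventBus bus;
    private final Set<String> unwritableSkus; // simulates a failing store, for the sketch only

    CatalogContext(EventBus bus, Set<String> unwritableSkus) {
        this.bus = bus;
        this.unwritableSkus = unwritableSkus;
        bus.subscribe(PriceChangePublished.class, this::on);
    }

    private void on(PriceChangePublished event) {
        try {
            updateReadModel(event);
        } catch (RuntimeException failure) {
            // No coordinator to call: the catalog announces the failure as an event.
            bus.publish(new PriceChangeRollbackRequested(event.proposalId(), failure.getMessage()));
        }
    }

    private void updateReadModel(PriceChangePublished event) {
        if (unwritableSkus.contains(event.sku())) {
            throw new IllegalStateException("catalog write failed for " + event.sku());
        }
    }
}

final class ProposalContext {
    final Map<String, String> stateById = new HashMap<>();
    private final EventBus bus;

    ProposalContext(EventBus bus) {
        this.bus = bus;
        // The undo is a reaction to an event, not a command from a coordinator.
        bus.subscribe(PriceChangeRollbackRequested.class,
                event -> stateById.put(event.proposalId(), "ROLLED_BACK"));
    }

    void publish(String proposalId, String sku, double newPrice) {
        stateById.put(proposalId, "PUBLISHED");
        bus.publish(new PriceChangePublished(proposalId, sku, newPrice));
    }
}
```

Neither context calls the other directly. The catalog only knows how to say "this failed", and the proposal side only knows how to react to that statement, which is also why tracing the chain after the fact needs deliberate observability.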

How I actually choose

Strip away the religion and the decision comes down to two questions, which is why it fits on a single chart:

The vertical axis is how complex and stateful the decisioning is — does it branch, wait on humans, need timeouts, need ordered compensation? The more it does, the more it wants a conductor.

The horizontal axis is how many independent teams or contexts participate — is this one bounded flow, or a fan-out across the org? The more independent participants, the more a central coordinator becomes a coupling bottleneck.

That gives four honest answers. Complex decisioning owned by essentially one context: orchestrate, with one state machine, like the price-approval lifecycle or a payment-settlement saga. Simple, autonomous reactions spread across many teams: choreograph, with events and no coordinator, like post-publish propagation. Simple and single-owner: it doesn't matter much, so keep it simple and don't over-engineer. And the genuinely hard quadrant, complex processes that also span many independent teams: orchestrate the core and choreograph the edges. That is precisely the pricing platform's whole design.
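For readers who think in code, the four quadrants reduce to a two-input function. This is only a restatement of the chart, with the two axes flattened to booleans for illustration; the enum and method names are made up, and real workflows sit on a spectrum rather than at the corners.

```java
// Hypothetical names; the two booleans flatten the chart's axes for illustration.
enum CoordinationStyle {
    ORCHESTRATE,
    CHOREOGRAPH,
    KEEP_IT_SIMPLE,
    ORCHESTRATE_CORE_CHOREOGRAPH_EDGES
}

final class CoordinationChooser {
    /**
     * @param complexStatefulDecisioning branches, waits on humans, needs timeouts or ordered compensation
     * @param manyIndependentTeams       fan-out across several teams or bounded contexts
     */
    static CoordinationStyle choose(boolean complexStatefulDecisioning, boolean manyIndependentTeams) {
        if (complexStatefulDecisioning && manyIndependentTeams) {
            return CoordinationStyle.ORCHESTRATE_CORE_CHOREOGRAPH_EDGES;
        }
        if (complexStatefulDecisioning) {
            return CoordinationStyle.ORCHESTRATE;
        }
        if (manyIndependentTeams) {
            return CoordinationStyle.CHOREOGRAPH;
        }
        return CoordinationStyle.KEEP_IT_SIMPLE;
    }
}
```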

Why one system, deliberately, uses both

This is the punchline I most want to leave you with, because it's the thing the design-review argument gets wrong at its root: orchestration and choreography are not competing philosophies you pick between. They are tools for different layers of the same system.

The pricing platform orchestrates the part that is one team's complex, human-gated decision — the approval lifecycle — and choreographs the part that is many teams' autonomous reactions — the downstream propagation. The seam between them is the published event: orchestration's job ends when the price is approved and published; choreography's job begins there. Drawing that seam in the right place is the actual architectural skill. Put it too early and you've scattered complex decisioning across event handlers nobody can follow. Put it too late and you've dragged half the org into one team's state machine.

There's a capacity dimension to this too, and it reinforces the same seam. The approval flow is low-volume and human-paced — measured in proposals and approvals, where a state machine's overhead is irrelevant. The propagation is high-volume and bursty — a repricing event can fan tens of thousands of SKU updates outward, partitioned by SKU so each product's updates stay ordered. You would not want a single orchestrator as the chokepoint for that fan-out, and you would not want a human approval expressed as a fire-and-forget event. The performance profile and the coordination style line up on the same boundary, which is usually a sign you've drawn it correctly.
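The "partitioned by SKU" detail is what keeps per-product ordering intact under that fan-out. A minimal sketch of the idea, assuming a fixed partition count and using String.hashCode as a stand-in for whatever key hashing the actual broker applies (SkuPartitioner is a hypothetical name):

```java
final class SkuPartitioner {
    /**
     * Maps a SKU to a partition so that every update for the same SKU lands on the
     * same partition and is therefore consumed in the order it was produced.
     * String.hashCode is a stand-in here; a real broker applies its own key hash.
     */
    static int partitionFor(String sku, int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        // floorMod keeps the result in [0, partitionCount) even for negative hash codes.
        return Math.floorMod(sku.hashCode(), partitionCount);
    }
}
```

Updates for different SKUs may land on different partitions and be processed in parallel, which is where the throughput comes from. Ordering is guaranteed only within a SKU, and only while the partition count stays fixed.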

The rule I actually use

When someone asks me "orchestration or choreography?" the honest senior answer is "for which part?" — and then: orchestrate where one owner must reason about a complex, stateful flow and undo it cleanly when it breaks; choreograph where independent teams should react on their own terms without asking permission; and spend real effort placing the seam between the two. The conductor and the dancers are not in competition. A good production is both, and it knows exactly where the baton stops and the choreography takes over.

I built a complete, runnable reference implementation of all of this — the orchestrated price-approval state machine with a pluggable rules engine and human-in-the-loop, the choreographed post-publish propagation across catalog and search with event-driven compensation, and a canary rollout — in Java / Spring, with architecture decision records and a one-command local run.

Clone it and run docker compose up: https://github.com/mizbamd/pricing-orchestration

It's one of five reference implementations in an open Enterprise Platform Reference Architecture covering legacy modernization, production RAG, governed AI agents, MACH pricing, and a streaming lakehouse. I write about building platforms that are not allowed to fail — follow along.

Originally published on Medium.