Gabriel Anhaia

Posted on May 19

Aggregates in PHP: The Boundary Most Teams Get Wrong

#php #ddd #architecture #design

Book: Decoupled PHP — Clean and Hexagonal Architecture for Applications That Outlive the Framework

You join a PHP team six months into a rewrite. The lead opens the domain folder and shows you the Customer class. It has addOrder(), addInvoice(), addPaymentMethod(), and addShippingAddress(). The constructor argument list spans two screens. Loading a customer from the database takes four joins. Saving one writes to five tables in a single transaction. The team calls this a "rich aggregate."

You ask what happens on Black Friday when two endpoints both touch the same customer.

The lead shrugs. "We have a unique constraint."

That is the boundary problem. Aggregates are the part of DDD that most teams get wrong, and PHP makes it especially easy to get wrong because Eloquent, Doctrine, and the framework tutorials all push you toward big object graphs. A too-big aggregate looks correct on a whiteboard and falls apart the moment two requests arrive at once. A too-small one leaks invariant checks into application services and queues. The boundary is the decision that matters.

This post is about where to draw that boundary, in PHP 8.3, with real code.

What an aggregate actually is

An aggregate is a cluster of objects that gets treated as a single unit for changes. One object inside the cluster is the root. Code outside the aggregate is only allowed to hold references to the root. The root is the only object that enforces the aggregate's invariants, and the only object that the repository can load and save.

That is the whole definition. Three rules:

One transaction per aggregate. When you save it, every object inside is consistent at that moment.
External references point only to the root. If another aggregate needs to talk to this one, it holds the root's ID, not a pointer to a child entity.
Invariants live on the root. The root is the bouncer. Any state change goes through one of its methods.

Vaughn Vernon's Implementing Domain-Driven Design lays out four heuristics that fall out of those rules: model true invariants in a consistency boundary, design small aggregates, reference other aggregates by identity, and update other aggregates with eventual consistency. The boundary problem is really the first two.

The aggregate is the unit of transactional consistency. Everything inside changes together or nothing changes. Everything outside is eventually consistent with it. That sentence is the entire payoff of the pattern, and it's what the wrong boundary destroys.

The Order aggregate, done right

The reference shape: Order is the root, and OrderLine and ShippingAddress are children that exist only as part of an order. Customer is not in this aggregate. Neither are invoices or payments. They are their own aggregates, and they reference the order by ID.

<?php

declare(strict_types=1);

namespace App\Domain\Ordering;

use App\Domain\Shared\Money;
use App\Domain\Customer\CustomerId;

final class Order
{
    /** @var list<OrderLine> */
    private array $lines = [];

    private function __construct(
        public readonly OrderId $id,
        public readonly CustomerId $customerId,
        private ShippingAddress $shippingAddress,
        private OrderStatus $status,
        private readonly \DateTimeImmutable $placedAt,
    ) {}

    public static function place(
        OrderId $id,
        CustomerId $customerId,
        ShippingAddress $shippingAddress,
        \DateTimeImmutable $placedAt,
    ): self {
        return new self(
            $id,
            $customerId,
            $shippingAddress,
            OrderStatus::Pending,
            $placedAt,
        );
    }
}

The constructor is private. Construction goes through place(), which is the only way an Order enters the Pending state from nothing. The root holds the CustomerId as a value, not a Customer object. That detail decides the boundary.

Now the operations that mutate the order. These methods belong inside the Order class above, before its closing brace:

    public function addLine(
        ProductId $productId,
        int $quantity,
        Money $unitPrice,
    ): void {
        if ($this->status !== OrderStatus::Pending) {
            throw new OrderAlreadyConfirmed($this->id);
        }
        if ($quantity < 1) {
            throw new InvalidQuantity($quantity);
        }
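        // Same product again: merge into the existing line, which keeps
        // its original unit price.
        foreach ($this->lines as $line) {
            if ($line->productId->equals($productId)) {
                $line->increase($quantity);
                return;
            }
        }

        // Only a genuinely new line counts against the cap.
        if (count($this->lines) >= 100) {
            throw new TooManyLines($this->id);
        }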

        $this->lines[] = new OrderLine(
            OrderLineId::new(),
            $productId,
            $quantity,
            $unitPrice,
        );
    }

    public function changeShippingAddress(ShippingAddress $new): void
    {
        if ($this->status !== OrderStatus::Pending) {
            throw new OrderAlreadyConfirmed($this->id);
        }
        $this->shippingAddress = $new;
    }

    public function confirm(): void
    {
        if ($this->status !== OrderStatus::Pending) {
            throw new OrderAlreadyConfirmed($this->id);
        }
        if ($this->lines === []) {
            throw new CannotConfirmEmptyOrder($this->id);
        }
        $this->status = OrderStatus::Confirmed;
    }

    public function total(): Money
    {
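        // Assumes every line is priced in EUR; a multi-currency shop would
        // store the currency on the order instead of hard-coding it here.
        $total = Money::zero('EUR');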
        foreach ($this->lines as $line) {
            $total = $total->add($line->subtotal());
        }
        return $total;
    }
}

Three things happen on every state change. The root checks its own status. The root applies the invariants. The root mutates child entities only through its own methods. Nothing outside this class can put the order into an inconsistent state, because nothing outside this class has a reference to the line array or to the status.

The child entity is simpler:

<?php

declare(strict_types=1);

namespace App\Domain\Ordering;

use App\Domain\Shared\Money;

final class OrderLine
{
    public function __construct(
        public readonly OrderLineId $id,
        public readonly ProductId $productId,
        private int $quantity,
        private readonly Money $unitPrice,
    ) {}

    public function increase(int $by): void
    {
        if ($by < 1) {
            throw new InvalidQuantity($by);
        }
        $this->quantity += $by;
    }

    public function subtotal(): Money
    {
        return $this->unitPrice->multiply($this->quantity);
    }
}

OrderLine has identity, so it is an entity, not a value object. But its identity only matters inside the order. No code anywhere else in the application loads an OrderLine directly. There is no OrderLineRepository. If you want to change a line you load the order, call a method on it, then save it back. That is the rule, and breaking it is how the boundary collapses.

ShippingAddress here is a value object: replace it whole, never mutate it in place. That keeps the order's snapshot of "where I'm going" honest at every state transition.
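The article does not show ShippingAddress itself, so here is a minimal sketch of what "replace it whole" can look like with a PHP 8.2+ readonly class. The field names, the validation rules, and the withStreet() helper are assumptions for illustration, not the shape used in the book.

```php
<?php

declare(strict_types=1);

// Hypothetical fields: the article only says ShippingAddress is a value object.
final readonly class ShippingAddress
{
    public function __construct(
        public string $street,
        public string $city,
        public string $postalCode,
        public string $countryCode,
    ) {
        if (trim($street) === '' || trim($city) === '') {
            throw new InvalidArgumentException('Street and city are required.');
        }
        if (preg_match('/^[A-Z]{2}$/', $countryCode) !== 1) {
            throw new InvalidArgumentException('Country must be a two-letter code.');
        }
    }

    // Value objects compare by content, not by identity.
    public function equals(self $other): bool
    {
        return $this->street === $other->street
            && $this->city === $other->city
            && $this->postalCode === $other->postalCode
            && $this->countryCode === $other->countryCode;
    }

    // "Changing" a field returns a new instance; the original is untouched.
    public function withStreet(string $street): self
    {
        return new self($street, $this->city, $this->postalCode, $this->countryCode);
    }
}

$a = new ShippingAddress('1 Main St', 'Lisbon', '1000-001', 'PT');
$b = $a->withStreet('2 Side St');
```

Because the class is readonly, the only way to give an order a different address is to hand the root a new instance through changeShippingAddress(), which is where the status check lives.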

Where the boundary goes wrong

The team in the opening built a single Customer aggregate that owned every order the customer had ever placed, every invoice, every payment method. The reasoning sounded good at the design meeting: "A customer has many orders. The customer is the natural root."

Here is what that looks like in code:

final class Customer
{
    public function __construct(
        public readonly CustomerId $id,
    ) {}

    /** @var list<Order> */
    private array $orders = [];

    /** @var list<Invoice> */
    private array $invoices = [];

    /** @var list<PaymentMethod> */
    private array $paymentMethods = [];

    public function placeOrder(
        ShippingAddress $address,
        \DateTimeImmutable $at,
    ): Order {
        $order = Order::place(
            OrderId::new(),
            $this->id,
            $address,
            $at,
        );
        $this->orders[] = $order;
        return $order;
    }

    public function payInvoice(InvoiceId $id, Money $amount): void
    {
        foreach ($this->invoices as $invoice) {
            if ($invoice->id->equals($id)) {
                $invoice->pay($amount);
                return;
            }
        }
        throw new InvoiceNotFound($id);
    }
}

It compiles. It runs. The tests pass on a freshly seeded database. Then production happens.

Failure 1: the load is unbounded

The Doctrine mapping fetches orders, invoices, and payment methods eagerly every time the customer aggregate is loaded. A customer with 3 years of history has 240 orders, 240 invoices, and 4 payment methods. Every endpoint that touches the customer pulls 484 child rows. Adding a single shipping address turns into a SELECT that joins four tables.

You can lazy-load. Doctrine and Eloquent both support it. But lazy-loading inside an aggregate is its own footgun. Invariant checks that need to look at sibling entities now silently trigger queries deep inside domain code, and the N+1 query problem hides behind a method call that looks like pure logic. The right answer is to make the aggregate small enough that you can afford to load it eagerly.

Failure 2: write contention

Two requests hit the same customer at the same instant. One adds a line to a pending order. The other pays an old invoice. Both load the customer aggregate. Both write to the customer's row (or, depending on the ORM, lock it with SELECT ... FOR UPDATE). One wins. The other waits for the lock, or fails its version check and retries.

This is the killer. The boundary defines the lock. A bigger boundary means more requests fighting over the same lock. A Customer aggregate that owns orders and invoices means every order placement and every invoice payment serializes on the customer. On Black Friday, an account with high traffic is a hot row.

When the boundary is correct (Order is one aggregate, Invoice is another), order placement and invoice payment don't touch each other. N requests can work on one customer's orders and invoices in parallel, because none of them locks the customer row. A single hot customer that used to serialize every write on one row now scales with the order and invoice tables independently.
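To make "the boundary defines the lock" concrete, here is a small sketch of optimistic locking with one version number per aggregate. VersionedStore is a hypothetical in-memory stand-in for a table with a version column, and the keys are made up; a real setup would rely on the ORM's own versioning support.

```php
<?php

declare(strict_types=1);

final class StaleVersion extends RuntimeException {}

// Hypothetical in-memory stand-in for a table with a version column.
final class VersionedStore
{
    /** @var array<string, int> aggregate key => current version */
    private array $versions = [];

    // Returns the version a request saw when it loaded the aggregate.
    public function load(string $key): int
    {
        return $this->versions[$key] ?? 0;
    }

    // Mimics "UPDATE ... WHERE version = :seen": fails if someone saved first.
    public function save(string $key, int $seenVersion): void
    {
        if ($this->load($key) !== $seenVersion) {
            throw new StaleVersion("Conflict on {$key}");
        }
        $this->versions[$key] = $seenVersion + 1;
    }
}

// Two requests both load before either saves; returns how many saves conflict.
function conflicts(VersionedStore $store, string $keyA, string $keyB): int
{
    $seenA = $store->load($keyA);
    $seenB = $store->load($keyB);
    $conflicts = 0;
    foreach ([[$keyA, $seenA], [$keyB, $seenB]] as [$key, $seen]) {
        try {
            $store->save($key, $seen);
        } catch (StaleVersion) {
            $conflicts++;
        }
    }
    return $conflicts;
}

// Big boundary: adding an order line and paying an invoice both save the customer.
$big = conflicts(new VersionedStore(), 'customer:42', 'customer:42');

// Small boundaries: the same two requests save different aggregates.
$small = conflicts(new VersionedStore(), 'order:1001', 'invoice:77');

printf("big boundary conflicts: %d, small boundary conflicts: %d\n", $big, $small);
```

The two requests are identical in both runs. Only the aggregate they save changes, and that alone decides whether one of them has to retry.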

Failure 3: the invariant doesn't actually live there

"A customer's lifetime spend equals the sum of all paid invoices." That sounds like a customer invariant. It isn't, because it doesn't need to hold inside a single transaction. If you pay invoice A and read the customer's lifetime spend two seconds later, a slightly stale answer is fine: nobody makes business decisions on a total that lags by a second or two. That figure belongs in a read model, not in the write-side aggregate.

The test for "is this a real invariant in this aggregate" is: does the business break if this value is stale by 100ms? If no, it's eventual consistency, and it does not belong inside the boundary.
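One way to keep lifetime spend outside the write-side boundary is a projection fed by events, sketched below. The InvoicePaid event, the LifetimeSpendProjection class, and the integer-cents amounts are all assumptions; the article does not define an event model.

```php
<?php

declare(strict_types=1);

// Hypothetical event, published after the Invoice transaction commits.
final readonly class InvoicePaid
{
    public function __construct(
        public string $customerId,
        public int $amountInCents,
    ) {}
}

// Read-side projection: eventually consistent with the Invoice aggregate.
final class LifetimeSpendProjection
{
    /** @var array<string, int> customer ID => total paid, in cents */
    private array $totals = [];

    public function onInvoicePaid(InvoicePaid $event): void
    {
        $this->totals[$event->customerId] =
            ($this->totals[$event->customerId] ?? 0) + $event->amountInCents;
    }

    public function forCustomer(string $customerId): int
    {
        return $this->totals[$customerId] ?? 0;
    }
}

$projection = new LifetimeSpendProjection();
$projection->onInvoicePaid(new InvoicePaid('customer-42', 12_000));
$projection->onInvoicePaid(new InvoicePaid('customer-42', 3_500));

echo $projection->forCustomer('customer-42'), "\n"; // 15500
```

Paying an invoice never loads or locks the customer here. A production projection would also need to persist its totals and ignore redelivered events, which this sketch leaves out.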

Reference the other aggregate by ID

Once you accept that Customer, Order, and Invoice are three aggregates, the cross-aggregate code becomes obvious. Pass IDs, not objects.

<?php

declare(strict_types=1);

namespace App\Application\Ordering;

use App\Domain\Customer\CustomerRepository;
use App\Domain\Customer\CustomerId;
use App\Domain\Ordering\Order;
use App\Domain\Ordering\OrderId;
use App\Domain\Ordering\OrderRepository;
use App\Domain\Ordering\ShippingAddress;

final readonly class PlaceOrder
{
    public function __construct(
        private CustomerRepository $customers,
        private OrderRepository $orders,
        private \Psr\Clock\ClockInterface $clock,
    ) {}

    public function __invoke(
        CustomerId $customerId,
        ShippingAddress $shippingAddress,
    ): OrderId {
        $customer = $this->customers->get($customerId);
        $customer->assertCanPlaceOrders();

        $order = Order::place(
            OrderId::new(),
            $customerId,
            $shippingAddress,
            $this->clock->now(),
        );
        $this->orders->save($order);

        return $order->id;
    }
}

The use case loads the customer aggregate to check eligibility — assertCanPlaceOrders() throws if the customer is blocked, suspended, or has no verified email. Then it creates the order aggregate, separately. Two repository calls, two transactions if you want, or one outer transaction if the use case demands it. The aggregates themselves don't know about each other beyond the ID.

A subtle thing: the customer aggregate stays small. It owns identity, profile, status, contact details — the things that change rarely and need cross-field consistency. Orders and invoices live elsewhere. The customer aggregate's assertCanPlaceOrders() method reads its own state and decides. It does not iterate over orders, because it doesn't hold any.
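A small customer aggregate along those lines might look like the sketch below. The article names assertCanPlaceOrders() but not the state behind it, so the CustomerStatus enum, the emailVerified flag, the exception class, and the plain string ID (instead of CustomerId) are assumptions.

```php
<?php

declare(strict_types=1);

enum CustomerStatus
{
    case Active;
    case Suspended;
    case Blocked;
}

final class CustomerCannotPlaceOrders extends DomainException {}

// Hypothetical shape: identity, status, and contact state. No orders, no invoices.
final class Customer
{
    public function __construct(
        public readonly string $id,
        private CustomerStatus $status,
        private bool $emailVerified,
    ) {}

    public function suspend(): void
    {
        $this->status = CustomerStatus::Suspended;
    }

    // Decides from the customer's own state only; nothing else is loaded.
    public function assertCanPlaceOrders(): void
    {
        if ($this->status !== CustomerStatus::Active) {
            throw new CustomerCannotPlaceOrders("Customer {$this->id} is not active.");
        }
        if (!$this->emailVerified) {
            throw new CustomerCannotPlaceOrders("Customer {$this->id} has no verified email.");
        }
    }
}

$customer = new Customer('customer-42', CustomerStatus::Active, true);
$customer->assertCanPlaceOrders(); // passes silently

$customer->suspend();
try {
    $customer->assertCanPlaceOrders();
    $rejected = false;
} catch (CustomerCannotPlaceOrders) {
    $rejected = true;
}
```

Loading this aggregate costs one row no matter how many orders the customer has placed.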

The rule of thumb

Three questions before you put two things in the same aggregate:

Do they need to be consistent in the same transaction? Not "would it be nice." Not "the report looks wrong if not." Does the business break if they're 100ms out of sync? If no, two aggregates.
Will they be modified by the same use cases at the same rates? An order's lines change as the cart is built. The customer's name doesn't. Different change cadences mean different locks mean different aggregates.
Does the size grow unbounded? A customer's order history grows forever. An order's line count is bounded by the cart. Unbounded children are a tell that the boundary is wrong.

If all three answers point at one aggregate, great: keep them together. If any answer points elsewhere, you have at least two aggregates with an ID reference between them.

What you do with this

Open your domain folder. Find every class with more than three collections inside it. For each collection, ask the three questions. The collections that fail "needs same-transaction consistency" come out. Replace them with an ID on the other side and a separate repository. Write a use case that loads both aggregates when the workflow needs both, and let the application layer be the seam.
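If the domain folder is large, a few lines of reflection can produce the first shortlist. This is a rough sketch that assumes collections are declared with the native array type; properties typed as a Doctrine Collection, or left untyped, would need an extra check. GreedyCustomer is a made-up class used only to exercise the helper.

```php
<?php

declare(strict_types=1);

// Counts properties declared with the native `array` type.
function countArrayProperties(string $className): int
{
    $count = 0;
    foreach ((new ReflectionClass($className))->getProperties() as $property) {
        $type = $property->getType();
        if ($type instanceof ReflectionNamedType && $type->getName() === 'array') {
            $count++;
        }
    }
    return $count;
}

// Hypothetical class standing in for a too-greedy aggregate.
final class GreedyCustomer
{
    private array $orders = [];
    private array $invoices = [];
    private array $paymentMethods = [];
    private array $shippingAddresses = [];
    private string $name = '';
}

$collections = countArrayProperties(GreedyCustomer::class);
if ($collections > 3) {
    echo GreedyCustomer::class, " holds {$collections} collections; review its boundary\n";
}
```

The count only tells you where to look. Whether a collection stays is still decided by the three questions, not by the number.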

The boundary is the load-bearing decision of an aggregate-style domain model. Get it right and repositories, use cases, transactions, and events fall out mechanically. Get it wrong and every concurrency bug and slow endpoint traces back to one too-greedy class that thought it was the center of the world.

Order is the center of order. Customer is the center of customer. They are not the same thing.

If this was useful

Aggregates, ports, use cases, and the seams between them are the spine of Decoupled PHP. The book walks the same shape from a single-aggregate service up to a multi-context application with HTTP, queue, and CLI adapters, and it spends a full chapter on how to redraw boundaries inside a legacy Laravel or Symfony codebase without a freeze week.