Gabriel Anhaia

Posted on May 19

Observability Without Coupling: Logging, Tracing, and Metrics Through Ports

#php #observability #architecture #opentelemetry

Book: Decoupled PHP — Clean and Hexagonal Architecture for Applications That Outlive the Framework
Also by me: LLM Observability Pocket Guide
Me: xgabriel.com | GitHub

You've watched it happen on a codebase you liked. Someone opens a use case to "add logging." Three months later that use case knows about Monolog channels, a handler stack, a JSON formatter, and the exact shape your log pipeline expects. The constructor has eleven parameters. The unit test boots half the framework.

The domain didn't ask for any of that. It just wanted to record that an order was placed.

This is how observability becomes coupling. Logs are the easy on-ramp. Tracing arrives next because someone wanted to know why a request was slow, so they pulled the OpenTelemetry SDK into a service class. Metrics come last, usually as a StatsD client a constructor happens to know how to find. By the end the use case is observable, but it's also unportable. Swap log shippers and you rewrite the use case. Migrate from StatsD to OTel metrics and the application layer takes the blast.

The thesis of this post is small and unfashionable: observability is plumbing. Plumbing belongs behind ports. The same discipline that kept Eloquent out of your domain keeps Monolog handler stacks out of it too.

Logs, traces, and metrics are I/O

Vendor docs sell logs, traces, and metrics as if they were a new layer of the application. They aren't. They are I/O. Each one sends bytes to a process the app doesn't control: a log shipper, a trace collector, a metrics agent. Your use case has the same relationship to those processes as it has to a database or an HTTP client. Send something. Don't care who reads it.

That gives you three ports.

A Logger that takes a message and a structured context.
A Tracer that opens and closes spans.
A MetricsRecorder that records counters, gauges, and histograms.

Each one is an interface the application depends on. Each one is implemented by an adapter at the edge. The use case takes the port through its constructor. The container wires the real adapter in production and a recording fake in tests. If you've ever written a Repository interface, this is the same pattern applied to a different I/O concern.

The PSR-3 exception

PHP has a real standard for logging, and ignoring it is a mistake.

Psr\Log\LoggerInterface is the interface defined by PSR-3. It has lived through Monolog 1, 2, and 3. Symfony has shipped against it since the 2.x line. Zend Framework became Laminas along the way; the interface did not move. Every major PHP library that accepts an injected logger accepts a LoggerInterface. The method set has not changed since 2012; psr/log 2.0 and 3.0 only added native parameter and return types.

That's the definition of a stable abstraction. The general rule is to write your own ports rather than borrow other people's. PSR-3 is the exception. The interface lives outside any framework. It expresses logging in the language of "level, message, context". That is the language of logging, not the language of a particular vendor. Adopting it costs nothing and buys more than a decade of compatibility.

So: depend on Psr\Log\LoggerInterface in your application and domain code. Don't write a custom App\Application\Port\Logger.

Tracing and metrics are different. There is no PSR for them. The closest thing to a standard is the OpenTelemetry API, which is excellent but still moving. Writing your own thin Tracer and MetricsRecorder ports keeps the application layer stable while the ecosystem settles.

What to log at each layer

A common failure mode of "observability" is shotgun logging — every method emits a line so that "we have logs." The logs are useless because they record activity rather than events.

A more disciplined split looks like this.

The domain layer logs almost nothing directly. Domain code doesn't have a logger injected. It raises domain events (OrderPlaced, PaymentDeclined, InventoryReserved) that the application layer can choose to log. The reason is testability and clarity. A pure entity that calls $logger->info(...) has a side effect and a hidden dependency. An entity that records a Placed event states a fact and lets someone else decide what to do with it.

The application layer logs use case start, use case end, and outcome. This is where the request meets a verb. It's the natural place to record "PlaceOrder began with input X, ended with outcome Y." Include identifiers, not full entities. A hash of the input is fine. The full Customer object is not.

The infrastructure layer logs the things only infrastructure knows: HTTP request and response shapes (sensitive headers redacted), sampled database queries, external API calls and their latency. This is where a SQL logger or a Guzzle middleware belongs. It does not belong in a use case.

When the domain layer is silent, your logs read as a story of decisions: requests in, use cases run, outcomes out, side effects fired. When the domain layer chatters, you get noise that obscures the same story.
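To make the domain-layer rule concrete, here is a minimal sketch of an entity that records an event instead of logging. The OrderPlaced fields, the simplified Order::place() signature, and the string identifiers are assumptions for illustration; releaseEvents() mirrors the call the use case makes later in this post. Namespaces are left out so the file runs on its own.

```php
<?php
declare(strict_types=1);

// Hypothetical event: a plain value object that states a fact.
final readonly class OrderPlaced
{
    public function __construct(
        public string $orderId,
        public string $customerId,
        public \DateTimeImmutable $placedAt,
    ) {}
}

// Simplified entity: no logger in sight, only a buffer of recorded events.
final class Order
{
    /** @var list<object> */
    private array $recordedEvents = [];

    private function __construct(
        public readonly string $id,
        public readonly string $customerId,
    ) {}

    public static function place(
        string $id,
        string $customerId,
        \DateTimeImmutable $now,
    ): self {
        $order = new self($id, $customerId);
        // Record what happened; someone else decides whether to log it.
        $order->recordedEvents[] = new OrderPlaced($id, $customerId, $now);

        return $order;
    }

    /** @return list<object> Events recorded so far; the buffer is emptied. */
    public function releaseEvents(): array
    {
        $events = $this->recordedEvents;
        $this->recordedEvents = [];

        return $events;
    }
}

$order = Order::place(
    'order-1',
    'customer-7',
    new \DateTimeImmutable('2024-01-01T00:00:00Z'),
);
$events = $order->releaseEvents();
```

The application layer decides what to do with the released events: publish them, log them, or both. The entity never finds out.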

A Tracer port, in PHP

A Tracer port has a surprisingly small surface. Open a span, close it, attach attributes describing what the span did.

<?php
declare(strict_types=1);

namespace App\Application\Port;

interface Tracer
{
    public function startSpan(string $name): Span;
}

interface Span
{
    public function setAttribute(
        string $key,
        string|int|float|bool $value
    ): void;

    public function recordException(\Throwable $error): void;

    public function end(): void;
}

The application layer uses this without knowing whether the implementation is OpenTelemetry, Datadog, X-Ray, or a no-op. The use case calls startSpan, optionally adds attributes, and ends the span when it's done. If something throws, the use case calls recordException before rethrowing.

The OpenTelemetry adapter is a thin wrapper around the OpenTelemetry API's TracerInterface.

<?php
declare(strict_types=1);

namespace App\Infrastructure\Observability;

use App\Application\Port\Span;
use App\Application\Port\Tracer;
use OpenTelemetry\API\Trace\SpanInterface;
use OpenTelemetry\API\Trace\TracerInterface;
use OpenTelemetry\Context\Context;

final readonly class OtelTracer implements Tracer
{
    public function __construct(
        private TracerInterface $inner,
    ) {}

    public function startSpan(string $name): Span
    {
        $span = $this->inner
            ->spanBuilder($name)
            ->setParent(Context::getCurrent())
            ->startSpan();

        return new OtelSpan($span);
    }
}

final readonly class OtelSpan implements Span
{
    public function __construct(
        private SpanInterface $inner,
    ) {}

    public function setAttribute(
        string $key,
        string|int|float|bool $value
    ): void {
        $this->inner->setAttribute($key, $value);
    }

    public function recordException(\Throwable $error): void
    {
        $this->inner->recordException($error);
    }

    public function end(): void
    {
        $this->inner->end();
    }
}

For tests and for environments where tracing is off, a NullTracer returns a NullSpan that does nothing. It takes a handful of lines.

<?php
declare(strict_types=1);

namespace App\Infrastructure\Observability;

use App\Application\Port\Span;
use App\Application\Port\Tracer;

final readonly class NullTracer implements Tracer
{
    public function startSpan(string $name): Span
    {
        return new NullSpan();
    }
}

final readonly class NullSpan implements Span
{
    public function setAttribute(
        string $key,
        string|int|float|bool $value
    ): void {}
    public function recordException(\Throwable $error): void {}
    public function end(): void {}
}

The trick is what you span. Spans cost. Each one is a record sent across the wire to a collector. Spanning every method call multiplies cost and clutters traces. Span the things that cross a boundary: a use case entry, a database call, an outbound HTTP request, a queue publish. That gives you a trace that reads as a story of work, instead of a flame graph of every method.
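One way to keep spans at boundaries is a small helper that wraps a single unit of work. This is a sketch, not part of the ports above: inSpan(), RecordingTracer, and RecordingSpan are hypothetical names, and the two interfaces are repeated only so the file runs on its own.

```php
<?php
declare(strict_types=1);

// The two ports, repeated so this file is self-contained.
interface Span
{
    public function setAttribute(string $key, string|int|float|bool $value): void;
    public function recordException(\Throwable $error): void;
    public function end(): void;
}

interface Tracer
{
    public function startSpan(string $name): Span;
}

// Hypothetical test double: remembers what each span was told.
final class RecordingSpan implements Span
{
    /** @var array<string,string|int|float|bool> */
    public array $attributes = [];
    /** @var list<\Throwable> */
    public array $exceptions = [];
    public bool $ended = false;

    public function __construct(public readonly string $name) {}

    public function setAttribute(string $key, string|int|float|bool $value): void
    {
        $this->attributes[$key] = $value;
    }

    public function recordException(\Throwable $error): void
    {
        $this->exceptions[] = $error;
    }

    public function end(): void
    {
        $this->ended = true;
    }
}

final class RecordingTracer implements Tracer
{
    /** @var list<RecordingSpan> */
    public array $spans = [];

    public function startSpan(string $name): Span
    {
        $span = new RecordingSpan($name);
        $this->spans[] = $span;

        return $span;
    }
}

// Hypothetical helper: runs $work inside one span, records any exception
// before rethrowing it, and always ends the span.
function inSpan(Tracer $tracer, string $name, callable $work): mixed
{
    $span = $tracer->startSpan($name);
    try {
        return $work($span);
    } catch (\Throwable $error) {
        $span->recordException($error);
        throw $error;
    } finally {
        $span->end();
    }
}

$tracer = new RecordingTracer();

// One span around one boundary crossing: here, a pretend database insert.
$rows = inSpan($tracer, 'db.orders.insert', function (Span $span): int {
    $span->setAttribute('db.rows', 1);

    return 1;
});
```

The helper makes the boundary explicit at the call site. If a method doesn't deserve a call to it, it probably doesn't deserve a span.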

A MetricsRecorder port

Metrics are the smallest port of the three.

<?php
declare(strict_types=1);

namespace App\Application\Port;

interface MetricsRecorder
{
    /** @param array<string,string|int> $tags */
    public function counter(string $name, array $tags = []): void;

    /** @param array<string,string|int> $tags */
    public function gauge(
        string $name,
        float $value,
        array $tags = []
    ): void;

    /** @param array<string,string|int> $tags */
    public function histogram(
        string $name,
        float $value,
        array $tags = []
    ): void;
}

A counter increments. A gauge records a current value. A histogram records a distribution, typically latency. The shape covers StatsD, OpenTelemetry metrics, the Prometheus Pushgateway, and most things in between. The tags are a flat key-value map.

The OpenTelemetry adapter wraps the OpenTelemetry API's MeterInterface and caches the instruments by name so you don't re-create them on every call.

<?php
declare(strict_types=1);

namespace App\Infrastructure\Observability;

use App\Application\Port\MetricsRecorder;
use OpenTelemetry\API\Metrics\MeterInterface;

final class OtelMetricsRecorder implements MetricsRecorder
{
    /** @var array<string,\OpenTelemetry\API\Metrics\CounterInterface> */
    private array $counters = [];

    /** @var array<string,\OpenTelemetry\API\Metrics\GaugeInterface> */
    private array $gauges = [];

    /** @var array<string,\OpenTelemetry\API\Metrics\HistogramInterface> */
    private array $histograms = [];

    public function __construct(
        private readonly MeterInterface $meter,
    ) {}

    public function counter(string $name, array $tags = []): void
    {
        $instrument = $this->counters[$name]
            ??= $this->meter->createCounter($name);
        $instrument->add(1, $tags);
    }

    public function gauge(
        string $name,
        float $value,
        array $tags = []
    ): void {
        $instrument = $this->gauges[$name]
            ??= $this->meter->createGauge($name);
        $instrument->record($value, $tags);
    }

    public function histogram(
        string $name,
        float $value,
        array $tags = []
    ): void {
        $instrument = $this->histograms[$name]
            ??= $this->meter->createHistogram($name);
        $instrument->record($value, $tags);
    }
}

The synchronous Gauge instrument lands in opentelemetry-php SDK 1.1+. If you are on an older release, swap the gauge() body for an ObservableGauge registered once at construction time with a callback that reads from a stored value map. The port interface stays the same.

The naming convention matters more than the implementation. orders.placed is a counter. orders.queue.depth is a gauge. orders.place.duration_ms is a histogram. Pick a convention, document it once, enforce it in code review. Whichever vendor you pick later inherits the names.

Cardinality is what blows up metrics bills. A tag with a million unique values bankrupts the time-series store. Keep tag values low-cardinality (status code, region, currency) and keep identifiers out of tags entirely.
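One way to enforce the cardinality rule in code rather than in review is a decorator around the port. The sketch below is an assumption, not something the port requires: AllowListedTags and its allow-list are hypothetical, and the anonymous class stands in for a real adapter. The interface is repeated so the file runs on its own.

```php
<?php
declare(strict_types=1);

// The port, repeated so this file is self-contained.
interface MetricsRecorder
{
    /** @param array<string,string|int> $tags */
    public function counter(string $name, array $tags = []): void;

    /** @param array<string,string|int> $tags */
    public function gauge(string $name, float $value, array $tags = []): void;

    /** @param array<string,string|int> $tags */
    public function histogram(string $name, float $value, array $tags = []): void;
}

// Hypothetical decorator: forwards every call, but drops any tag whose key
// is not on the allow-list before the inner recorder sees it.
final readonly class AllowListedTags implements MetricsRecorder
{
    /** @param list<string> $allowedKeys */
    public function __construct(
        private MetricsRecorder $inner,
        private array $allowedKeys,
    ) {}

    public function counter(string $name, array $tags = []): void
    {
        $this->inner->counter($name, $this->filter($tags));
    }

    public function gauge(string $name, float $value, array $tags = []): void
    {
        $this->inner->gauge($name, $value, $this->filter($tags));
    }

    public function histogram(string $name, float $value, array $tags = []): void
    {
        $this->inner->histogram($name, $value, $this->filter($tags));
    }

    /**
     * @param array<string,string|int> $tags
     * @return array<string,string|int>
     */
    private function filter(array $tags): array
    {
        return array_intersect_key($tags, array_flip($this->allowedKeys));
    }
}

// Stand-in for a real adapter: collects the tags it receives.
$sink = new class implements MetricsRecorder {
    /** @var list<array<string,string|int>> */
    public array $seen = [];

    public function counter(string $name, array $tags = []): void
    {
        $this->seen[] = $tags;
    }

    public function gauge(string $name, float $value, array $tags = []): void
    {
        $this->seen[] = $tags;
    }

    public function histogram(string $name, float $value, array $tags = []): void
    {
        $this->seen[] = $tags;
    }
};

$metrics = new AllowListedTags($sink, ['currency', 'region', 'status']);

// customer_id is an identifier, so it never reaches the sink.
$metrics->counter('orders.placed', [
    'currency'    => 'EUR',
    'customer_id' => 'c-123',
]);
```

Silently dropping a tag is one policy. Throwing in development and dropping in production is another. Either way the use case keeps talking to the same port.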

Constructor injection, not statics

Two features of mainstream PHP stacks invite a shortcut here. Laravel exposes a Log facade, so Log::info(...) works inside any class, resolved from a global container. Symfony projects often reach for Psr\Log\LoggerAwareTrait, which puts a nullable $this->logger on the class and fills it through a setter. Both are convenient. Both keep the dependency out of the constructor, so nothing in the signature says the class logs.

Don't reach for either.

The rule that kept your repository out of App::make() keeps your logger out of the facade. A use case that calls Log::info(...) is a use case that cannot be unit-tested without booting the framework. A use case that takes LoggerInterface in its constructor can be tested with a NullLogger, a RecordingLogger, or any of the test doubles your suite already builds.

The same applies to tracing and metrics. If you ever find yourself reaching for a static Tracer::startSpan(), you're reaching for a facade. Inject the port instead.

A worked example: PlaceOrder

Here's the use case from the rest of the book, with three new dependencies wired in.

<?php
declare(strict_types=1);

namespace App\Application\UseCase;

use App\Application\Port\MetricsRecorder;
use App\Application\Port\Tracer;
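use Psr\Log\LoggerInterface;
// Imports for the repositories, gateway, event bus, clock, and domain types
// from the rest of the book are omitted here.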

final readonly class PlaceOrder
{
    public function __construct(
        private OrderRepository $orders,
        private CustomerRepository $customers,
        private PaymentGateway $payments,
        private EventBus $events,
        private Clock $clock,
        private LoggerInterface $logger,
        private Tracer $tracer,
        private MetricsRecorder $metrics,
    ) {}

    public function execute(PlaceOrderInput $input): PlaceOrderOutput
    {
        $span = $this->tracer->startSpan('use_case.place_order');
        $span->setAttribute('customer.id', $input->customerId);

        $this->logger->info('PlaceOrder started', [
            'customer_id' => $input->customerId,
            'item_count'  => count($input->items),
        ]);

        $startedAt = microtime(true);

        try {
            $customer = $this->customers
                ->find(new CustomerId($input->customerId))
                ?? throw new CustomerNotFound($input->customerId);

            $orderId = OrderId::generate();
            $order = Order::place(
                $orderId,
                $customer->id,
                $this->lineItemsFrom($input),
                $this->clock->now(),
            );

            $this->payments->charge(
                $customer->id,
                $order->total(),
                $input->idempotencyKey,
            );
            $this->orders->save($order);
            $this->events->publishAll($order->releaseEvents());

            $this->metrics->counter('orders.placed', [
                'currency' => $input->currency,
            ]);
            $this->logger->info('PlaceOrder completed', [
                'customer_id' => $input->customerId,
                'order_id'    => $orderId->value,
            ]);

            $span->setAttribute('order.id', $orderId->value);

            return new PlaceOrderOutput(
                orderId: $orderId->value,
                totalCents: $order->total()->amountInMinorUnits,
                currency: $input->currency,
                status: $order->status()->value,
            );
        } catch (\Throwable $error) {
            $span->recordException($error);
            $this->metrics->counter('orders.place.failed', [
                'reason' => $error::class,
            ]);
            $this->logger->error('PlaceOrder failed', [
                'customer_id' => $input->customerId,
                'error'       => $error::class,
                'message'     => $error->getMessage(),
            ]);
            throw $error;
        } finally {
            $this->metrics->histogram(
                'orders.place.duration_ms',
                (microtime(true) - $startedAt) * 1000.0,
            );
            $span->end();
        }
    }

    private function lineItemsFrom(PlaceOrderInput $input): array
    {
        // ... line item construction
        return [];
    }
}

The use case knows nothing about Monolog. It knows nothing about OpenTelemetry. It knows about three interfaces. The container wires Monolog as the LoggerInterface, the OpenTelemetry SDK behind the Tracer, and a StatsD or OpenTelemetry adapter behind the MetricsRecorder. Swap any of them and the use case doesn't care.

Some readers will look at this and feel the use case got noisy. It did. That's the cost of operating a service in production rather than running it from a script. The noise is in one place, behind interfaces you control, instead of smeared across the domain.

The reward: tests that assert on observability

The payoff is the test you can write.

<?php
declare(strict_types=1);

namespace Tests\Application\UseCase;

use App\Application\UseCase\PlaceOrder;
use App\Infrastructure\Observability\NullTracer;
use PHPUnit\Framework\Attributes\Test;
use PHPUnit\Framework\TestCase;
use Tests\Doubles\InMemoryCustomerRepository;
use Tests\Doubles\InMemoryEventBus;
use Tests\Doubles\InMemoryOrderRepository;
use Tests\Doubles\InMemoryPaymentGateway;
use Tests\Doubles\FixedClock;
use Tests\Doubles\RecordingLogger;
use Tests\Doubles\RecordingMetrics;

final class PlaceOrderTest extends TestCase
{
    #[Test]
    public function it_logs_the_outcome_and_records_a_counter(): void
    {
        $logger  = new RecordingLogger();
        $metrics = new RecordingMetrics();

        $useCase = new PlaceOrder(
            orders:    new InMemoryOrderRepository(),
            customers: new InMemoryCustomerRepository([$this->aCustomer()]),
            payments:  new InMemoryPaymentGateway(),
            events:    new InMemoryEventBus(),
            clock:     new FixedClock($this->aFixedInstant()),
            logger:    $logger,
            tracer:    new NullTracer(),
            metrics:   $metrics,
        );

        // aCustomer(), aFixedInstant(), and anInput() are fixture helpers,
        // omitted here.
        $useCase->execute($this->anInput());

        self::assertTrue(
            $logger->hasInfoContaining('PlaceOrder completed')
        );
        self::assertSame(
            1,
            $metrics->counterValue('orders.placed')
        );
    }
}

You can assert on observability the same way you assert on behavior. The use case doesn't know it's being observed; it just is. The test doesn't need a real log file, a real trace collector, or a real metrics agent. The recording fakes give you exactly what the use case sent through the ports, and you can make claims about it.
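The test above leans on RecordingMetrics without showing it. A minimal sketch of what such a fake might look like follows; counterValue() matches the call in the test, while histogramValues(), gaugeValue(), and the decision to ignore tags are assumptions. The port is repeated so the file runs on its own.

```php
<?php
declare(strict_types=1);

// The port, repeated so this file is self-contained.
interface MetricsRecorder
{
    /** @param array<string,string|int> $tags */
    public function counter(string $name, array $tags = []): void;

    /** @param array<string,string|int> $tags */
    public function gauge(string $name, float $value, array $tags = []): void;

    /** @param array<string,string|int> $tags */
    public function histogram(string $name, float $value, array $tags = []): void;
}

// Hypothetical recording fake. Tags are ignored to keep the sketch short;
// a fuller fake could key its maps by name plus tags.
final class RecordingMetrics implements MetricsRecorder
{
    /** @var array<string,int> */
    private array $counters = [];

    /** @var array<string,float> */
    private array $gauges = [];

    /** @var array<string,list<float>> */
    private array $histograms = [];

    public function counter(string $name, array $tags = []): void
    {
        $this->counters[$name] = ($this->counters[$name] ?? 0) + 1;
    }

    public function gauge(string $name, float $value, array $tags = []): void
    {
        // A gauge keeps only the latest value.
        $this->gauges[$name] = $value;
    }

    public function histogram(string $name, float $value, array $tags = []): void
    {
        $this->histograms[$name][] = $value;
    }

    public function counterValue(string $name): int
    {
        return $this->counters[$name] ?? 0;
    }

    public function gaugeValue(string $name): ?float
    {
        return $this->gauges[$name] ?? null;
    }

    /** @return list<float> */
    public function histogramValues(string $name): array
    {
        return $this->histograms[$name] ?? [];
    }
}

$metrics = new RecordingMetrics();
$metrics->counter('orders.placed', ['currency' => 'EUR']);
$metrics->histogram('orders.place.duration_ms', 12.5);
```

A counter that was never touched reads as zero, so a test can assert that a failure path did not count a success.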

This matches every other test in a hex codebase: in-memory adapters, recording fakes, assertions against ports. The difference is what you assert on. You assert on the application's externally visible behavior to the operations team. For a service that lives in production, that is half of what the application does.

Anti-patterns to watch for

A few patterns recur. Each one looks reasonable until production breaks.

Static facade calls from the domain. Log::info(...) in an entity. The entity is now untestable without booting the application container, and the framework has crept past the inner ring. Record a domain event and let the application layer log it.

Logging entire entities. $logger->info('order created', ['order' => $order]). The serializer recurses into the customer, the items, the payment method, the address, and you've just shipped PII to a log aggregator that wasn't supposed to hold it. Log identifiers and outcomes. ['order_id' => $order->id->value, 'total_cents' => $order->total()->amountInMinorUnits] is enough.

Tracing every operation. Every method gets a span "for completeness." The trace store fills up. The collector falls behind. The traces become unreadable. Span the boundaries instead: use case entry, database round-trip, outbound call, message publish. Anything inside those is leaf work.

High-cardinality tags on metrics. ['customer_id' => $id] on a counter. You just turned a small time series into millions of unique series. Tags are for low-cardinality dimensions: status codes, regions, currencies.

Where to start

Full observability is a project, not a Monday afternoon. Every span correlated, every metric labeled, every log structured: the OpenTelemetry collector alone is enough infrastructure work to staff a person for a quarter. Don't try to build all of it on day one.

The narrower argument: even if you start with only one of them, get the port right. Structured logging behind PSR-3 is the cheapest place to start, and most teams already have it. The arrangement (LoggerInterface injected into use cases, structured context maps, no static facades) costs almost nothing if it's there from the first commit. Retrofitting it later is painful.

Tracing is what most teams add second, usually when "why is this endpoint sometimes slow?" becomes the recurring question. Add it when you need it. The port is small enough to add later without a rewrite, as long as use cases were never coupled to a specific tracer.

Metrics tend to come last. Logs and traces answer most questions for the first year. Metrics start to earn their keep when you want alerts ("p99 of PlaceOrder over 500ms for 5 minutes") that don't require a human reading logs.

Get the logger port right today. Defer the others until the question they answer is the question you're actually asking.

If this was useful

This pattern is one chapter in Decoupled PHP — the same port-and-adapter discipline applied to the rest of a production PHP service: testing, error handling, transactions, domain events, and the migration playbook for retrofitting it into a Laravel or Symfony codebase that grew its observability the hard way. If you want the LLM-flavoured version of the same thinking — what changes when the I/O is a model provider, a vector store, and a trace store full of token counts — the LLM Observability Pocket Guide is the companion.