A wide day rather than a deep one — four separate threads across a few projects, each with a lesson worth keeping. I'll teach the patterns and keep the specifics generic. The through-line: make the system honest about what it's actually doing — which queries it fires, whether a service is really up, what a tool will do when you call it twice, and in what order a password change should land.
The performance thread got big enough that I split it into its own focused post; here's the short version plus the three other threads.
Thread 1 — Stop paying for queries you don't use
A sustained sweep through an app (and the package behind it) hunting wasted database work. The highlights:
- Arm an N+1 detector in dev only. A query detector wired in behind an environment check turns invisible lazy-loads into a visible to-do list. Never in production — it's a developer aid, not a runtime guard.
- Unused eager loads are N+1s in disguise. Index screens love to with(['creator', 'approver']) for columns a redesign later removed. Not a loop, but the same disease: queries you hydrate and throw away. Delete the eager loads with no consumer in the view.
- Memoize per-request constants. A default-connection resolver and a sidebar unread count were both recomputed on every call. ??= once, reuse for the rest of the request.
- Collapse a dashboard's stat queries. ~20 count() calls became one grouped query per table, wrapped in a short-lived cache. A dashboard can tolerate being a few seconds stale; trade live-to-the-second for cheap.
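The memoization bullet above can be sketched in plain PHP. This is a minimal illustration, not the app's real resolver: the class name, the `lookups` counter, and the `'primary'` value are all assumptions made so the example runs on its own.

```php
<?php

// Hypothetical resolver: the class name, method names, and the "expensive"
// lookup are illustrative assumptions, not the app's real code.
final class ConnectionResolver
{
    private ?string $default = null;

    /** Counts how often the expensive lookup actually runs. */
    public int $lookups = 0;

    public function defaultConnection(): string
    {
        // ??= only evaluates the right-hand side while the property is null,
        // so the lookup runs once for the lifetime of this object.
        return $this->default ??= $this->resolveFromConfig();
    }

    private function resolveFromConfig(): string
    {
        $this->lookups++;

        return 'primary'; // stand-in for a config or database read
    }
}

$resolver = new ConnectionResolver();
$resolver->defaultConnection();
$resolver->defaultConnection();

echo $resolver->lookups, PHP_EOL; // prints 1
```

"Once per request" only holds if the same object is reused for the whole request, for example by binding it as a singleton. Note also that `??=` cannot memoize a legitimately null result, because null looks the same as "not computed yet".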
The meta-lesson: performance at this layer is mostly removal, and you lock it in with a Pest query-count assertion so nobody quietly re-adds an N+1 six months later. Full write-up in the focused post.
Thread 2 — Health checks that actually check
Here's a trap I keep seeing in "is it up?" tooling: the check verifies the record exists, or that a config row is present, and calls it green. That's not a health check — that's a config check. The service can be configured perfectly and still be unreachable.
Today's work made a service-health command verify real reachability: open an actual connection to the target and confirm the upstream answers. The clean way to do that without welding the check to one transport is a small contract:
interface TargetProbeInterface
{
    public function reachable(string $host, int $port, float $timeout = 2.0): bool;
}

final class TcpTargetProbe implements TargetProbeInterface
{
    public function reachable(string $host, int $port, float $timeout = 2.0): bool
    {
        $connection = @fsockopen($host, $port, $errno, $errstr, $timeout);

        if ($connection === false) {
            return false;
        }

        fclose($connection);

        return true;
    }
}
A TCP probe is the cheapest honest signal — can I open a socket to host:port within a timeout? Behind the TargetProbeInterface contract you can swap in an HTTP probe, a TLS-handshake probe, or a mock in tests, without the health command knowing the difference. That's the driver-pattern payoff: the policy (check each service) stays put while the mechanism (how you probe) is swappable.
The other subtlety: when you're checking many services, each probe should run on its own attached connection, not share one — otherwise one slow or wedged target poisons the rest of the batch. Isolation per check keeps one bad service from making everything look down.
it('reports a service as down when the upstream refuses the connection', function () {
    $probe = Mockery::mock(TargetProbeInterface::class);
    $probe->shouldReceive('reachable')->andReturnFalse();
    app()->instance(TargetProbeInterface::class, $probe);

    expect($this->checkService($someService)->status)->toBe(ServiceHealth::Down);
});
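The per-check isolation described above can be sketched without the framework. This is one possible shape, not the actual health command: `ScriptedProbe`, `checkAll()`, the host names, and the `'up'`/`'down'` strings are hypothetical, and the interface is repeated only so the example runs standalone.

```php
<?php

// Contract repeated from the post so the sketch runs standalone.
interface TargetProbeInterface
{
    public function reachable(string $host, int $port, float $timeout = 2.0): bool;
}

// Hypothetical test double: one host is unreachable and one host makes the
// probe itself throw, simulating a wedged target.
final class ScriptedProbe implements TargetProbeInterface
{
    public function reachable(string $host, int $port, float $timeout = 2.0): bool
    {
        if ($host === 'wedged.internal') {
            throw new RuntimeException('probe blew up');
        }

        return $host !== 'down.internal';
    }
}

/**
 * @param array<string, array{host: string, port: int}> $services
 * @return array<string, string> service name => 'up' or 'down'
 */
function checkAll(TargetProbeInterface $probe, array $services): array
{
    $results = [];

    foreach ($services as $name => $target) {
        try {
            // One probe call per service with its own timeout; nothing is
            // shared between iterations.
            $up = $probe->reachable($target['host'], $target['port'], 1.0);
        } catch (Throwable $e) {
            // A probe that throws only marks this one service as down.
            $up = false;
        }

        $results[$name] = $up ? 'up' : 'down';
    }

    return $results;
}

$results = checkAll(new ScriptedProbe(), [
    'api'    => ['host' => 'api.internal', 'port' => 443],
    'queue'  => ['host' => 'down.internal', 'port' => 5672],
    'search' => ['host' => 'wedged.internal', 'port' => 9200],
    'mail'   => ['host' => 'mail.internal', 'port' => 25],
]);

print_r($results);
```

The services after the wedged one still get an honest answer, which is the point of isolating each check. The checks here still run one after another, so a slow target delays the batch by at most its own timeout.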
Thread 3 — Making MCP tools safe to hand to an agent
A chunk of the day went into hardening an MCP (Model Context Protocol) server — the kind of tool surface you expose so an AI agent can do real work against your system. The moment a tool can create or reply or mutate, two questions matter more than features: what happens if it's called twice, and what could leak out the response path.
Idempotency keys on the write tools. Agents retry. Networks hiccup. If "create a ticket" or "reply to a ticket" runs twice because the first response got lost, you get duplicates. An idempotency key fixes that: the caller passes a key, and the server returns the same result for a repeat key instead of doing the work again.
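use Closure;
use Illuminate\Support\Facades\Cache;

trait InteractsWithIdempotency
{
    // Runs $work for the first call with a given key; repeat calls within
    // 24 hours get the cached result instead of doing the work again.
    protected function once(string $key, Closure $work): mixed
    {
        return Cache::remember(
            "mcp:idem:{$key}",
            now()->addHours(24),
            $work
        );
    }
}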
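One caveat worth knowing: Laravel's `Cache::remember` treats a stored null as a miss, so a write tool whose closure returns null would run again on retry, and two truly concurrent calls can both miss the cache. The sketch below shows the replay behaviour itself with a hypothetical in-memory store (`IdempotencyStore` and the ticket payload are assumptions for illustration). It handles the null case and makes no attempt at cross-process locking.

```php
<?php

// Hypothetical in-memory store standing in for the cache; the class and
// method names are assumptions made for this sketch.
final class IdempotencyStore
{
    /** @var array<string, mixed> */
    private array $results = [];

    public function once(string $key, Closure $work): mixed
    {
        // array_key_exists (not isset) so a stored null still counts as a hit.
        if (array_key_exists($key, $this->results)) {
            return $this->results[$key];
        }

        return $this->results[$key] = $work();
    }
}

$store = new IdempotencyStore();
$created = 0;

// Stand-in for a "create a ticket" tool; counts how often it really runs.
$createTicket = function () use (&$created): array {
    $created++;

    return ['ticket_id' => 101];
};

$first = $store->once('req-abc', $createTicket);
$retry = $store->once('req-abc', $createTicket);

var_dump($first === $retry); // bool(true)
echo $created, PHP_EOL;      // prints 1
```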
Scrub PII on the way out. Free-text fields are where personal data hides. Anything flowing back through a tool response gets run through a scrubber, and there's a hard boundary so internal-only notes never cross into client-visible output. The principle: the response path is a trust boundary — treat everything crossing it as public.
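A minimal sketch of the scrubbing step, assuming a regex-based approach. The function name, the two patterns, and the placeholder tokens are illustrative, not the scrubber actually used here, and two regexes are nowhere near complete PII coverage.

```php
<?php

// Hypothetical scrubber: the patterns and placeholder tokens are
// illustrative and far from exhaustive.
function scrubPii(string $text): string
{
    $patterns = [
        // Email addresses such as jane.doe@example.com
        '/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/' => '[email]',
        // Phone-like runs: optional +, a digit, 7+ digits or separators, a digit
        '/\+?\d[\d\s().-]{7,}\d/' => '[phone]',
    ];

    $clean = preg_replace(array_keys($patterns), array_values($patterns), $text);

    if ($clean === null) {
        // Fail closed: never fall back to returning the unscrubbed text.
        throw new RuntimeException('PII scrubbing failed');
    }

    return $clean;
}

echo scrubPii('Call +1 (555) 010-9999 or mail jane.doe@example.com about ticket 42.'), PHP_EOL;
// prints "Call [phone] or mail [email] about ticket 42."
```

Throwing on a regex failure rather than returning the input matches the trust-boundary principle: if the scrubber cannot run, nothing crosses.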
Typed error codes instead of raw strings. An agent can't reason about "something went wrong". Give it an enum of error codes and structured output, and a caller can branch on not_found vs unauthorized vs insufficient_data deterministically:
enum McpErrorCode: string
{
    case NotFound = 'not_found';
    case Unauthorized = 'unauthorized';
    case InsufficientData = 'insufficient_data';
    case Conflict = 'conflict';

    public function label(): string
    {
        return match ($this) {
            self::NotFound => 'Resource not found',
            self::Unauthorized => 'Not permitted',
            self::InsufficientData => 'Not enough data to answer',
            self::Conflict => 'Conflicting request',
        };
    }
}
That label() convention is the same one I use on every enum — the machine branches on the case, the human reads the label. Rounding out the day: exposing reference data as MCP resources (read-only context the agent can pull) and a few canned prompts for common ops flows. The shape of a good MCP server is starting to feel like the shape of a good API — typed, idempotent, with a clear public/private boundary.
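To make "branch on the case" concrete, here is a rough sketch of a structured error envelope and a caller that switches on it. The envelope keys (`ok`, `error`, `code`, `message`), `toolError()`, and `nextStep()` are assumptions for illustration, not part of the MCP specification or this server's actual output. The enum is a trimmed copy so the example runs on its own.

```php
<?php

// Trimmed copy of the post's enum so the sketch runs standalone.
enum McpErrorCode: string
{
    case NotFound = 'not_found';
    case Unauthorized = 'unauthorized';
    case InsufficientData = 'insufficient_data';
    case Conflict = 'conflict';
}

/**
 * Hypothetical envelope shape; the keys are assumptions for this sketch.
 *
 * @return array{ok: false, error: array{code: string, message: string}}
 */
function toolError(McpErrorCode $code, string $message): array
{
    return ['ok' => false, 'error' => ['code' => $code->value, 'message' => $message]];
}

// Caller side: rebuild the enum from the wire value and branch on it.
function nextStep(array $response): string
{
    $code = McpErrorCode::tryFrom($response['error']['code'] ?? '');

    return match ($code) {
        McpErrorCode::NotFound => 'ask the user for a different ID',
        McpErrorCode::Unauthorized => 'stop and report missing permission',
        McpErrorCode::InsufficientData => 'gather more input, then retry',
        McpErrorCode::Conflict => 'refetch state before retrying',
        null => 'unknown error: do not retry blindly',
    };
}

echo nextStep(toolError(McpErrorCode::Conflict, 'Ticket already closed')), PHP_EOL;
// prints "refetch state before retrying"
```

Because `tryFrom()` returns null for an unrecognized value, a caller running against a newer server with extra codes falls into the explicit "unknown" arm instead of crashing.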
Thread 4 — Getting password-reset ordering right
Identity work is unforgiving about order. Today's change reworked a password-reset flow that fans out to more than one downstream directory, and the sequencing turned out to be the whole game.
Without naming systems: when a reset has to propagate to multiple identity stores, you have to decide which one is the source of truth and write to it first, then let the rest follow. Get the order wrong and you can end up with a brief window where the stores disagree — the user's new password works in one place and not another. The fix was to make the authoritative directory the first write in the sequence, and to pull the self-service path back to only the stores that belong on it rather than every connected system.
The other half was giving this a configurable surface: a superadmin UI to set reset-password options and directory settings, instead of burying those decisions in code. Same lesson as the scheduler work from a couple of days ago — anything an operator might need to change shouldn't require a deploy. The architectural note worth keeping: when a flow touches multiple external systems, write down the ordering and the source-of-truth explicitly, ideally as a small Action with a test, because "it worked when I tried it" is not a guarantee about ordering under failure.
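The "small Action with a test" idea might look roughly like the following. Everything here is hypothetical: the `IdentityStore` contract, `ResetPasswordAction`, `RecordingStore`, and the directory names are stand-ins, since the real systems are deliberately not named.

```php
<?php

// Hypothetical contract: the interface, classes, and store names are
// assumptions for illustration, not the systems from the post.
interface IdentityStore
{
    public function name(): string;

    /** @throws RuntimeException when the write fails */
    public function setPassword(string $userId, string $password): void;
}

final class ResetPasswordAction
{
    /** @param list<IdentityStore> $followers */
    public function __construct(
        private IdentityStore $authoritative,
        private array $followers,
    ) {
    }

    /** @return list<string> names of the stores written, in order */
    public function execute(string $userId, string $password): array
    {
        // Source of truth first. If this throws, no follower is touched.
        $this->authoritative->setPassword($userId, $password);
        $written = [$this->authoritative->name()];

        foreach ($this->followers as $store) {
            $store->setPassword($userId, $password);
            $written[] = $store->name();
        }

        return $written;
    }
}

// Fake store that records writes and can be told to fail.
final class RecordingStore implements IdentityStore
{
    /** @var list<string> */
    public array $writes = [];

    public function __construct(private string $name, private bool $fails = false)
    {
    }

    public function name(): string
    {
        return $this->name;
    }

    public function setPassword(string $userId, string $password): void
    {
        if ($this->fails) {
            throw new RuntimeException("{$this->name} rejected the write");
        }

        $this->writes[] = $userId;
    }
}

$primary = new RecordingStore('primary-directory');
$mirror = new RecordingStore('mirror-directory');

$written = (new ResetPasswordAction($primary, [$mirror]))->execute('user-1', 'new-secret');
print_r($written);
```

The sketch only pins down the ordering and the "authoritative failure stops everything" rule. What to do when a follower fails after the authoritative write succeeded (retry, queue, or alert) is a separate decision it leaves open.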
Wrap
Four threads, one theme: honesty. Honest queries (only fetch what you use), honest health checks (probe the real thing), honest tools (idempotent, typed, with a leak boundary), honest identity flows (explicit ordering and a source of truth). None of it is flashy. All of it is the kind of work that keeps a system trustworthy when you're not looking at it.