Problem #1/100: Why your "clean" codebase still takes two weeks to change one thing

#programming #saas #mvp #startup

You've refactored the fat controllers. You moved logic into services. Your classes have single responsibilities, your methods are short, you use enums and value objects. The code looks clean.

And yet — every time the product owner asks to change how failed actions are retried, your team touches twelve files. Every time you swap an external provider, the estimate is two weeks. Every time someone asks "can we add logging here?", the answer involves a pull request across six services.

The codebase looks clean. The architecture is still tangled.

The rule

Business logic must not know what infrastructure surrounds it.

Your domain actions — the things your product actually does — should declare what happened. Infrastructure decides what to do about it. When domain actions also handle logging, failure persistence, retry scheduling, and notifications, you don't have separation. You have infrastructure wearing a service-class costume.

The example that passes code review

A workflow automation system. Multiple step types - send an email, generate a document, update a CRM record. Each step can fail, and failures need to be logged to a tracking table, recorded as a note on the deal, and retried.

Here's what the code looks like after a "clean up" refactor:

// Step A — send automated email
final class SendAutomatedEmail
{
    public function __construct(
        private NoteService $noteService,
        private StepFailureRepository $failureRepository,
        private MailerInterface $mailer,
    ) {}

    public function execute(Workflow $workflow, Deal $deal): void
    {
        $content = $this->mailer->render($workflow->templateId(), $deal);

        if (!$content->hasBody()) {
            $this->noteService->addWorkflowNote(
                $workflow, $deal,
                noteType: NoteType::EMAIL_STEP_FAILED,
                reason: 'Email body rendered empty.',
            );
            $this->failureRepository->persist(
                workflow: $workflow,
                step: $this,
                errorDetails: ['code' => StepErrorCode::EMPTY_BODY->value],
            );
            throw new \RuntimeException('Email body rendered empty.');
        }

        $this->mailer->send($content, $deal->contactEmail());
    }
}

// Step B — generate proposal document
final class GenerateProposal
{
    public function __construct(
        private NoteService $noteService,
        private StepFailureRepository $failureRepository,
        private DocumentGenerator $generator,
    ) {}

    public function execute(Workflow $workflow, Deal $deal): void
    {
        $result = $this->generator->generate($workflow->proposalConfig(), $deal);

        if ($result->failed()) {
            $this->noteService->addWorkflowNote(
                $workflow, $deal,
                noteType: NoteType::PROPOSAL_STEP_FAILED,
                reason: 'Document generation failed.',
            );
            $this->failureRepository->persist(
                workflow: $workflow,
                step: $this,
                errorDetails: ['code' => StepErrorCode::GENERATION_FAILED->value],
            );
            throw new \RuntimeException('Document generation failed.');
        }

        $this->generator->attach($result, $deal);
    }
}

This code passes review. Named arguments, enums for error codes, a dedicated repository, a note service. It looks like someone who read the books.

But it has the same structural problem as a fat controller. Every step repeats the same three-part failure ritual: log a note → persist a failure record → throw. The ritual is infrastructure - the business doesn't care how failures are tracked. The business cares that the email was empty or the document didn't generate. Everything after that is plumbing.

Now add a retry mechanism. You touch every step. Change the failure table schema. Every step. Add a Slack notification on failure. Every step. The "clean" code has the same change-propagation cost as the spaghetti it replaced.

After

// Step A — declares what went wrong, nothing else
final class SendAutomatedEmail
{
    public function __construct(private MailerInterface $mailer) {}

    public function execute(Workflow $workflow, Deal $deal): void
    {
        $content = $this->mailer->render($workflow->templateId(), $deal);

        if (!$content->hasBody()) {
            throw WorkflowStepFailed::because(
                step: $workflow->currentStep(),
                reason: FailureReason::EmptyEmailBody,
            );
        }

        $this->mailer->send($content, $deal->contactEmail());
    }
}

// Step B — same shape, different reason
final class GenerateProposal
{
    public function __construct(private DocumentGenerator $generator) {}

    public function execute(Workflow $workflow, Deal $deal): void
    {
        $result = $this->generator->generate($workflow->proposalConfig(), $deal);

        if ($result->failed()) {
            throw WorkflowStepFailed::because(
                step: $workflow->currentStep(),
                reason: FailureReason::DocumentGenerationFailed,
            );
        }

        $this->generator->attach($result, $deal);
    }
}

// Workflow runner — one place owns all failure infrastructure
final class WorkflowRunner
{
    public function __construct(
        private NoteService $notes,
        private StepFailureRepository $failures,
        private RetryScheduler $retries,
    ) {}

    public function run(WorkflowStep $step, Workflow $workflow, Deal $deal): void
    {
        try {
            $step->execute($workflow, $deal);
        } catch (WorkflowStepFailed $e) {
            $this->notes->addWorkflowNote($workflow, $deal, $e->reason);
            $this->failures->persist($workflow, $step, $e->details());
            $this->retries->schedule($workflow, $step);
        }
    }
}

What changed: Steps only declare domain facts — "the email body was empty." The runner decides what to do about it. Add retry? One file. Add Slack alert? One file. Change failure schema? One file. Steps never change for infrastructure reasons again.

The test

Pick a service class that "looks clean." Count its constructor dependencies. How many of those are infrastructure (repositories, loggers, notifiers, queue dispatchers) vs. domain (other business rules, value objects, domain services)? If infrastructure dependencies outnumber domain ones, the class is doing two jobs — it just doesn't look like it.

Which non-obvious case should be Problem #2? Drop the symptom in the comments — I pick the next article from what developers actually hit in the wild.

I help SaaS founders untangle these exact problems (and stop paying the compound interest on invisible architectural debt) at mvpforstartup.com.