Jonas Elfering for shopware

Posted on Aug 31, 2023

Benefits of separating core code from infrastructure code

#architecture #webdev #php #programming

When it comes to software design and software architecture, there are many theoretical approaches with fancy acronyms (like SOLID, CUPID, DDD, hexagonal architecture, clean architecture, …). However, it is not always easy to translate these concepts into practice, and it is not always clear where they overlap and where they do not. Therefore, it's important to think about the practices these approaches have in common and start from first principles. One really concrete practice that is encouraged by almost all software design flavors is the separation of core and infrastructure code. This post explores many of the benefits and challenges through a real-life, practical example, so that by the end you will have a better understanding of how and why a software system might implement this tenet of sustainable architecture.

What is core and what is infrastructure code anyways?

Every software system consists of core and infrastructure code. Infrastructure code, in simple terms, is the code that connects the main business logic (core code) to the outside world (e.g. to the web server or database). For a definition of core code, we can refer to Matthias Noback's book “Advanced Web Application Architecture”:

Rule 1:
Core code doesn’t directly depend on external systems, nor does it depend on code written for interacting with a specific type of external system.
Rule 2:
Core code doesn’t need a specific environment to run in, nor does it have dependencies that are designed to run in a specific context only.

A common symptom of weak software design is that, at the code level, it is difficult to distinguish between the two. This lack of separation reflects a web of dependencies that creates a host of issues, all of which we will gradually walk through with the example below.

For the purpose of this blog post, Infrastructure code should not be confused with “Infrastructure as Code” (the code needed to spin up the server/cloud infrastructure that the application runs on).

An example

Consider the following Symfony controller used to post a new review by a customer for a given product:

#[Route(path: '/store-api/product/{productId}/review', methods: ['POST'])]
public function save(
    string $productId, 
    RequestDataBag $data, 
    SalesChannelContext $context
): NoContentResponse {
    /** @var CustomerEntity $customer */
    $customer = $context->getCustomer();
    $customerId = $customer->getId();

    if (!$data->has('name')) {
        $data->set('name', $customer->getFirstName());
    }

    if (!$data->has('lastName')) {
        $data->set('lastName', $customer->getLastName());
    }

    $data->set('customerId', $customerId);
    $data->set('productId', $productId);

    $this->validate($data, $context->getContext());

    $review = [
        'productId' => $productId,
        'customerId' => $customerId,
        'externalUser' => $data->get('name'),
        'content' => $data->get('content'),
        'points' => $data->get('points'),
        'status' => false,
    ];

    if ($data->get('id')) {
        $review['id'] = $data->get('id');
    }

    $this->repository->upsert([$review], $context->getContext());

    $this->eventDispatcher->dispatch(
        new ReviewFormEvent($productId, $customerId, $data, $context)
    );

    return new NoContentResponse();
}

The code is rather straightforward to read: the controller accepts some data, fills in some fallback data based on the currently logged-in customer, validates the input, saves the review to the database and dispatches a domain event. In a nutshell, this controller does too much at once. Over the course of this post, we will use this example to study the problems this can cause, and gradually refactor it to illustrate how separating the core code from infrastructure code minimizes these issues or resolves them entirely.

Evolve infrastructure and core independently

Over time, your understanding of the underlying domain you are working in will improve, and you naturally will update your business logic to reflect this learning. In the best case, you want to have a single place where this knowledge is stored. On the other hand, if you want to grow your business, there may be other use cases where you want to provide the same business functionality (posting reviews) in different contexts. So there is a need for the infrastructure code and the core code to be able to evolve independently.

Reusability

In the controller example above, one can easily imagine that a new business requirement may arise that a merchant should be able to bulk import reviews from an existing review system. In the current implementation, this is not easily possible, because all the code is in a single place and it is not really reusable. In this scenario, the same business rules should apply, and the same domain events should be dispatched, because for the core domain logic, it is probably not that important from which source a new review was posted. Instead of copy & pasting a bunch of code to make the new use case work, the better option would be to separate the domain logic that is responsible for posting a review from the controller that provides that functionality over an HTTP-API.

So our code may now look like this:

#[Route(path: '/store-api/product/{productId}/review', methods: ['POST'])]
public function save(
    string $productId, 
    RequestDataBag $data, 
    SalesChannelContext $context
): NoContentResponse {
    /** @var CustomerEntity $customer */
    $customer = $context->getCustomer();
    $customerId = $customer->getId();

    if (!$data->has('name')) {
        $data->set('name', $customer->getFirstName());
    }

    if (!$data->has('lastName')) {
        $data->set('lastName', $customer->getLastName());
    }

    $data->set('customerId', $customerId);
    $data->set('productId', $productId);

    $this->postProductReviewService->post($data, $context->getContext());

    return new NoContentResponse();
}

class PostProductReviewService
{
    public function post(RequestDataBag $data, Context $context): void
    {
        $this->validate($data, $context);

        $review = [
            'productId' => $data->get('productId'),
            'customerId' => $data->get('customerId'),
            'externalUser' => $data->get('name'),
            'content' => $data->get('content'),
            'points' => $data->get('points'),
            'status' => false,
        ];

        if ($data->get('id')) {
            $review['id'] = $data->get('id');
        }

        $this->repository->upsert([$review], $context);

        $this->eventDispatcher->dispatch(
            new ReviewFormEvent($data->get('productId'), $data->get('customerId'), $data, $context)
        );
    }
}

With that structure in place, it is now rather easy to implement the “Bulk Import Reviews” functionality, by reusing the extracted service.
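As a rough sketch of that reuse, a bulk importer could simply loop over the imported rows and hand each one to the same service the controller uses. The `ReviewPoster` interface, the `BulkReviewImporter` class and the plain-array rows are hypothetical stand-ins invented for this illustration, so that the snippet runs without Shopware; at this stage of the refactoring the real service still takes a RequestDataBag and a Context.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for PostProductReviewService, reduced to plain arrays
// so that the sketch runs without Shopware.
interface ReviewPoster
{
    /** @param array<string, mixed> $review */
    public function post(array $review): void;
}

// Hypothetical importer: a second entry point next to the HTTP controller.
final class BulkReviewImporter
{
    public function __construct(private ReviewPoster $reviewPoster)
    {
    }

    /**
     * @param iterable<array<string, mixed>> $rows reviews exported from the old system
     * @return int number of reviews handed to the core service
     */
    public function import(iterable $rows): int
    {
        $imported = 0;
        foreach ($rows as $row) {
            // The same business rules and domain events apply as for a review
            // posted over the HTTP API, because the same service is called.
            $this->reviewPoster->post($row);
            ++$imported;
        }

        return $imported;
    }
}

// Recording fake, used here only to show that every row reaches the service.
final class RecordingReviewPoster implements ReviewPoster
{
    /** @var list<array<string, mixed>> */
    public array $posted = [];

    public function post(array $review): void
    {
        $this->posted[] = $review;
    }
}

$poster = new RecordingReviewPoster();
$count = (new BulkReviewImporter($poster))->import([
    ['productId' => 'p1', 'customerId' => 'c1', 'name' => 'Ada', 'content' => 'Great', 'points' => 5],
    ['productId' => 'p2', 'customerId' => 'c1', 'name' => 'Ada', 'content' => 'Okay', 'points' => 3],
]);
```

The importer contains no business rules of its own; it only adapts a different input source to the existing service.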

Note: The new core service still expects a RequestDataBag parameter, which is not optimal but will be tackled further down.

This refactoring step also helps when a requirement for additional validation comes up, for example that a customer must actually have purchased the product before the review is saved. After the refactoring, there is a single place where that logic can be added, and we can evolve the core code independently from the infrastructure.

Switching the underlying technology

Another case where infrastructure code may change but the core code should not be affected is when there is the need to switch the underlying technology. There is much talk about being able to (theoretically) switch the whole data storage layer for the complete application without affecting the core code. I have to admit that I have never seen that work out in real life, and I think that these kinds of replacements should not be our end goal, as such an application probably would be over-engineered. But the possibility to switch out underlying technology for specific parts of the application without having to touch the rest of the system can be a sweet spot.

Back to our code example, perhaps some requirement arises where product reviews should not be saved in the relational database, but rather posted to an external review system that should be integrated. To make this change easier, the code that actually takes care of saving the data should be extracted from the core service (as that is actually infrastructure code) and an interface should be introduced, so that the core service only relies on the interface.

The code could be refactored like this:

class PostProductReviewService
{
    public function post(RequestDataBag $data, Context $context): void
    {
        $this->validate($data, $context);

        $this->productReviewGateway->postReviewByCustomer($data, $context);

        $this->eventDispatcher->dispatch(
            new ReviewFormEvent($data->get('productId'), $data->get('customerId'), $data, $context)
        );
    }
}

interface ProductReviewGateway
{
    public function postReviewByCustomer(RequestDataBag $data, Context $context): void;
}

class DalProductReviewDataGateway implements ProductReviewGateway
{
    public function postReviewByCustomer(RequestDataBag $data, Context $context): void
    {
        $review = [
            'productId' => $data->get('productId'),
            'customerId' => $data->get('customerId'),
            'externalUser' => $data->get('name'),
            'content' => $data->get('content'),
            'points' => $data->get('points'),
            'status' => false,
        ];

        if ($data->get('id')) {
            $review['id'] = $data->get('id');
        }

        $this->repository->upsert([$review], $context);
    }
}

The fact that we use the Shopware DAL to store the reviews is now an implementation detail. We can easily change that storage layer without having to touch the core business logic.
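To make the swap more tangible, here is a minimal sketch of a second gateway implementation that forwards reviews to an external system. Everything in it is an assumption made for illustration: the gateway interface is simplified to a plain array without a Context, and `ReviewApiClient`, `ExternalProductReviewGateway`, the `/reviews` path and the payload keys are invented names, not part of Shopware or of any real review system.

```php
<?php
declare(strict_types=1);

// Simplified version of the gateway interface: plain array, no Context.
interface ProductReviewGateway
{
    /** @param array<string, mixed> $review */
    public function postReviewByCustomer(array $review): void;
}

// Hypothetical transport abstraction for the external review system.
interface ReviewApiClient
{
    /** @param array<string, mixed> $payload */
    public function send(string $path, array $payload): void;
}

// Hypothetical second implementation of the gateway.
final class ExternalProductReviewGateway implements ProductReviewGateway
{
    public function __construct(private ReviewApiClient $client)
    {
    }

    public function postReviewByCustomer(array $review): void
    {
        // Mapping to the external payload format is an infrastructure detail
        // that stays inside this class. Path and keys are made up.
        $this->client->send('/reviews', [
            'sku' => $review['productId'],
            'author' => $review['name'],
            'text' => $review['content'],
            'rating' => $review['points'],
        ]);
    }
}

// Core service, reduced to the gateway call: it only knows the interface.
final class PostProductReviewService
{
    public function __construct(private ProductReviewGateway $productReviewGateway)
    {
    }

    /** @param array<string, mixed> $review */
    public function post(array $review): void
    {
        $this->productReviewGateway->postReviewByCustomer($review);
    }
}

// Recording fake so the sketch runs without a network connection.
final class RecordingReviewApiClient implements ReviewApiClient
{
    /** @var list<array{0: string, 1: array<string, mixed>}> */
    public array $requests = [];

    public function send(string $path, array $payload): void
    {
        $this->requests[] = [$path, $payload];
    }
}

$client = new RecordingReviewApiClient();
$service = new PostProductReviewService(new ExternalProductReviewGateway($client));
$service->post(['productId' => 'p1', 'customerId' => 'c1', 'name' => 'Ada', 'content' => 'Great', 'points' => 5]);
```

Only the object passed to the service constructor changes; the service class itself is untouched.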

Caveat: Finding the right abstractions

When creating abstractions to hide technical decisions from the consumer of that code, a major challenge is defining the right level of abstraction.

On the one hand, it is easy to create an “abstraction” that is too specific and tightly coupled to the implementation by leaking technical details as part of the interface. This leads to a situation where even if there is an abstraction in place, it is not possible to switch the actual implementation without breaking the abstraction.

On the other hand, it might also happen that abstractions are too generic. If that is the case, then clients of the abstraction have to implement solutions for common cases that are actually better solved by the implementation of that abstraction.

So finding the right level of abstraction is always a trade-off between either too specific or too general.
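The three interfaces below sketch what these levels might look like for the review example. `SqlReviewStorage` and `KeyValueStorage` are hypothetical names invented for contrast, and the `ProductReviewGateway` signature is simplified to scalar parameters for this illustration.

```php
<?php
declare(strict_types=1);

// Too specific: table and column names leak the storage technology, so an
// implementation that posts to an external system cannot satisfy it.
interface SqlReviewStorage
{
    /** @param array<string, scalar|null> $columns */
    public function insertRow(string $table, array $columns): void;
}

// Too generic: every client has to invent keys and serialization itself,
// although the implementation would be the better place for that.
interface KeyValueStorage
{
    public function put(string $key, string $value): void;
}

// Closer to the use case: it states what happens in domain terms and says
// nothing about how or where the review is stored.
interface ProductReviewGateway
{
    public function postReviewByCustomer(
        string $productId,
        string $customerId,
        string $content,
        int $points
    ): void;
}
```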

Testability

Being able to easily write unit tests for your mission critical business logic is a prerequisite in order to deliver high-quality software. If the infrastructure and the core code are not separated, it is hard to write these unit tests, as basically the whole system has to be in place in order to perform them. Looking back at the original single controller example from the beginning, in order to write a test for that piece of code, a database needs to be set up, the Shopware DAL needs to be functional, which means that the Symfony container must be set up properly, and you have to fake a request in order for the tests to run at all.

The refactoring we made in order to evolve the infrastructure separately from the core code already improves testability a lot, as we can now test the infrastructure code and the core code separately. The core code in particular is easier to test, as we no longer need a running database and a fully set up Symfony container; instead, we can use a mock or dummy implementation of our ProductReviewGateway interface. The only thing that makes unit testing of our core code (the PostProductReviewService) harder than it needs to be is the dependency on an HTTP request in the form of the RequestDataBag argument. In order to make it easier to test, we can define a DTO that encapsulates the actual parameters that our service needs.

class ProductReviewPostData
{
    private function __construct(
        private string $productId,
        private string $customerId,
        private string $name,
        private string $content,
        private int $points,
    ) {
    }

    public static function fromRequestData(
        string $productId,
        string $customerId,
        RequestDataBag $data,
    ): self {
        return new self(
            $productId,
            $customerId,
            $data->get('name'),
            $data->get('content'),
            $data->get('points'),
        );
    }

    // Getters such as getProductId() and getCustomerId() are omitted for brevity.
}

This DTO object can in itself be easily unit tested, and the adjusted service is now also easier to test:

class PostProductReviewService
{
    public function post(ProductReviewPostData $data, Context $context): void
    {
        $this->validate($data, $context);

        $this->productReviewGateway->postReviewByCustomer($data, $context);

        $this->eventDispatcher->dispatch(
            new ReviewFormEvent($data->getProductId(), $data->getCustomerId(), $data, $context)
        );
    }
}

Not only is the service now easier to test, but it is also easier to consume the service API as the expected input parameters are now well defined in the DTO, and there is no need anymore to guess the parameters and their names based on the implementation. Additionally, the type system gives us some simple validation for free, like validating required and optional parameters as well as their basic types.
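A unit test for the service could then look roughly like the following sketch. It is deliberately reduced: the Context argument, validation and event dispatching are left out, the DTO constructor is public, and the getters and the `InMemoryProductReviewGateway` test double are assumptions added for this illustration. In a real project the checks at the end would live in a PHPUnit test case.

```php
<?php
declare(strict_types=1);

// Reduced DTO; the getters are assumed here, the post does not show them.
final class ProductReviewPostData
{
    public function __construct(
        private string $productId,
        private string $customerId,
        private string $name,
        private string $content,
        private int $points,
    ) {
    }

    public function getProductId(): string
    {
        return $this->productId;
    }

    public function getCustomerId(): string
    {
        return $this->customerId;
    }

    public function getPoints(): int
    {
        return $this->points;
    }
}

// Gateway interface without the Context argument, to keep the sketch small.
interface ProductReviewGateway
{
    public function postReviewByCustomer(ProductReviewPostData $data): void;
}

// Hypothetical test double: keeps reviews in memory instead of a database.
final class InMemoryProductReviewGateway implements ProductReviewGateway
{
    /** @var list<ProductReviewPostData> */
    public array $reviews = [];

    public function postReviewByCustomer(ProductReviewPostData $data): void
    {
        $this->reviews[] = $data;
    }
}

// Core service, reduced to the gateway call.
final class PostProductReviewService
{
    public function __construct(private ProductReviewGateway $productReviewGateway)
    {
    }

    public function post(ProductReviewPostData $data): void
    {
        $this->productReviewGateway->postReviewByCustomer($data);
    }
}

// The "test": no database, no container and no HTTP request are involved.
$gateway = new InMemoryProductReviewGateway();
$service = new PostProductReviewService($gateway);
$service->post(new ProductReviewPostData('product-1', 'customer-1', 'Ada', 'Works as described.', 4));
```

After the call, `$gateway->reviews` holds exactly the one review that was posted, which is all a test needs to assert on.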

The controller now looks like this:

#[Route(path: '/store-api/product/{productId}/review', methods: ['POST'])]
public function save(
    string $productId, 
    RequestDataBag $data, 
    SalesChannelContext $context
): NoContentResponse {
    /** @var CustomerEntity $customer */
    $customer = $context->getCustomer();
    $customerId = $customer->getId();

    if (!$data->has('name')) {
        $data->set('name', $customer->getFirstName());
    }

    if (!$data->has('lastName')) {
        $data->set('lastName', $customer->getLastName());
    }

    $postData = ProductReviewPostData::fromRequestData($productId, $customerId, $data);

    $this->postProductReviewService->post($postData, $context->getContext());

    return new NoContentResponse();
}

The influence on complexity

In order to keep a system in a state where it can be easily changed and evolved based on current needs, it is essential to keep the complexity of the overall system low. In general, the more things the system does, the more complex it becomes, but separating core from infrastructure code will help to keep complexity minimal. To see how many different things our original code does, we can separate it into the following “concerns”:

How are the HTTP-Request parameters transformed so our application can handle them? (Infrastructure concern)
What business rules apply when a product review is posted? (Core concern)
How is a product review stored in a persistent storage? (Infrastructure concern)

Our initial code handled all three concerns at the same time. That means when you want to understand how one of those concerns is solved, the code related to the other concerns is in fact adding complexity, which makes the part of the code you are actually interested in harder to understand. By separating out the concerns and handling them separately, it is easier to focus on one of the concerns without having to care how the others are solved; the level of complexity is reduced to a minimum.

We can even improve our code a little bit further by moving the fallback handling (which gets the name from the currently logged-in customer) to where the rest of the DTO is created to have that in a single place as well. So the code might look like this:

#[Route(path: '/store-api/product/{productId}/review', methods: ['POST'])]
public function save(
    string $productId,
    RequestDataBag $data, 
    SalesChannelContext $context
): NoContentResponse {
    /** @var CustomerEntity $customer */
    $customer = $context->getCustomer();

    $postData = ProductReviewPostData::fromRequestData($productId, $customer, $data);

    $this->postProductReviewService->post($postData, $context->getContext());

    return new NoContentResponse();
}

class ProductReviewPostData
{
    private function __construct(
        private string $productId,
        private string $customerId,
        private string $name,
        private string $content,
        private int $points,
    ) {
    }

    public static function fromRequestData(
        string $productId,
        CustomerEntity $customer,
        RequestDataBag $data,
    ): self {
        $name = $data->get('name') ?? $customer->getFirstName();
        $lastName = $data->get('lastName') ?? $customer->getLastName();

        return new self(
            $productId,
            $customer->getId(),
            $name,
            $data->get('content'),
            $data->get('points'),
        );
    }
}

With that refactoring, it’s now obvious we added a lastName field to the request based on the logged-in user, in case it was not provided by the client. But we actually never used lastName anywhere when storing the review; the product review does not distinguish between first and last names, and has only one name field. So in the original code, there lay a bug in plain sight, but it was hidden by all the accidental complexity surrounding it. By separating the different concerns, the bug becomes obvious and easy to fix.

class ProductReviewPostData
{
    private function __construct(
        private string $productId,
        private string $customerId,
        private string $name,
        private string $content,
        private int $points,
    ) {
    }

    public static function fromRequestData(
        string $productId,
        CustomerEntity $customer,
        RequestDataBag $data,
    ): self {
        $name = $data->get('name') ?? ($customer->getFirstName() . ' ' . $customer->getLastName());

        return new self(
            $productId,
            $customer->getId(),
            $name,
            $data->get('content'),
            $data->get('points'),
        );
    }
}

Improvements in collaboration

Another big benefit of separating core code from infrastructure code is that it can help you improve the collaboration with the non-engineers in your team, be it QA, designers or the people from the business side. When you separate the infrastructure-related code from the core code, it is way easier to see how the business rules for your application are implemented and if they match what the domain experts would expect. In the example case, some of the most interesting parts of the code are quite hidden in the $this->validate($data, $context) call. That method currently looks like this:

public function validate(ProductReviewPostData $data, Context $context): void
{
    $definition = new DataValidationDefinition('product.create_rating');

    $definition->add('name', new NotBlank());
    $definition->add('content', new NotBlank(), new Length(['min' => 40]));
    $definition->add('points', new GreaterThanOrEqual(1), new LessThanOrEqual(5));

    $criteria = new Criteria();
    $criteria->addFilter(new EqualsFilter('customerId', $data->getCustomerId()));
    $criteria->addFilter(new EqualsFilter('productId', $data->getProductId()));

    // ensure that the customer can only post one review per product
    $definition->add('id', new EntityNotExists([
        'entity' => 'product_review',
        'context' => $context,
        'criteria' => $criteria,
    ]));

    $this->validator->validate($data, $definition);
}

The code uses the Symfony validator component to perform the validation. The first part of the validation is rather straightforward: it performs some basic consistency checks on the data, e.g. that the name is filled, that the content of the review is at least 40 characters long and that the rating points are between one and five. With the new code structure, it would probably be best to move this part of the validation to the DTO object itself, so that it is not even possible to create a DTO that is invalid. One can still use the Symfony validator or perform those checks manually in the constructors.
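One way to perform those checks manually is sketched below. This is an assumption about how such a self-validating DTO could look, not the actual Shopware code: the constructor is public for brevity, the exception type and messages are made up, and `strlen` counts bytes, so a real implementation would probably prefer a multibyte-aware length check.

```php
<?php
declare(strict_types=1);

final class ProductReviewPostData
{
    public function __construct(
        private string $productId,
        private string $customerId,
        private string $name,
        private string $content,
        private int $points,
    ) {
        // Same rules as the validation definition above, enforced on creation.
        if (trim($name) === '') {
            throw new InvalidArgumentException('The name must not be blank.');
        }

        // strlen() counts bytes, which is good enough for this sketch.
        if (strlen($content) < 40) {
            throw new InvalidArgumentException('The content must be at least 40 characters long.');
        }

        if ($points < 1 || $points > 5) {
            throw new InvalidArgumentException('The points must be between 1 and 5.');
        }
    }

    public function getPoints(): int
    {
        return $this->points;
    }
}

// A valid review can be created ...
$valid = new ProductReviewPostData('product-1', 'customer-1', 'Ada', str_repeat('x', 40), 5);

// ... while an invalid one never comes into existence.
$error = null;
try {
    new ProductReviewPostData('product-1', 'customer-1', 'Ada', 'Too short', 5);
} catch (InvalidArgumentException $exception) {
    $error = $exception->getMessage();
}
```

Because every instance has passed these checks, code that receives the DTO does not need to repeat them.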

But the more interesting part is the last part of the validate method. There it uses some Shopware DAL magic to validate that the customer has not already reviewed the same product. This is a core business rule that should be enforced, but for now it is quite hidden in the validation code. To bring it into plain sight, we want to move this check into the body of the service, as that method should read like a workflow of what happens when a review is posted. To do so, we first create a new interface for this kind of check, as only the infrastructure layer can provide the information we need in the service, but those two layers should be kept separate. The interface might look like this:

interface ProductReviewConstraints
{
    public function hasCustomerAlreadyReviewedProduct(
        string $productId, 
        string $customerId, 
        Context $context
    ): bool;
}

And the implementation using the Shopware DAL might look like this:

class DALProductReviewConstraints implements ProductReviewConstraints
{
    public function hasCustomerAlreadyReviewedProduct(
        string $productId, 
        string $customerId, 
        Context $context
    ): bool {
        $criteria = new Criteria();
        $criteria->addFilter(new EqualsFilter('productId', $productId));
        $criteria->addFilter(new EqualsFilter('customerId', $customerId));

        return $this->productReviewRepository->searchIds($criteria, $context)->firstId() !== null;
    }
}

The business rule that says a customer can only post a review for a product once is now more obvious in the domain flow:

class PostProductReviewService
{
    public function post(ProductReviewPostData $data, Context $context): void
    {
        if ($this->reviewConstraints->hasCustomerAlreadyReviewedProduct(
            $data->getProductId(),
            $data->getCustomerId(),
            $context
        )) {
            throw new CustomerAlreadyReviewedProductException($data->getCustomerId(), $data->getProductId());
        }

        $this->productReviewGateway->postReviewByCustomer($data, $context);

        $this->eventDispatcher->dispatch(
            new ReviewFormEvent($data->getProductId(), $data->getCustomerId(), $data, $context)
        );
    }
}

With a structure like that, it is easy to discuss with peers whether we actually implemented the requirements the right way or whether something is missing. It is easy to spot that there is no check in place for whether the customer actually bought the product, which can spark discussions about whether and how to add such a check.

Conclusion: Keep infrastructure code at the boundaries of your codebase

The code as we left it at the end of this post is by no means perfect, but it is a perfect starting point for further refactoring based on the advanced architecture patterns mentioned in the beginning (DDD, hexagonal architecture, clean architecture, …). For example, proponents of DDD probably would refactor the service in a way to encapsulate the business logic inside a rich domain model and not let it stay directly in the service itself. But especially in DDD, you also don’t want any outside dependencies in your domain model, so it’s a necessary step to first separate your core code from the infrastructure code.

By now, this example walkthrough should already explain a lot of the benefits of separating the two. It makes it easier to interact with and evolve the codebase for you and your peers even when the requirements or the technology might considerably change. Ultimately, this is a general pattern that should be applied regardless of the architecture you have in place in your application, as it lays a concrete foundation for quality, sustainability and, dare I suggest, pleasure to work within your codebase.
