Andrea Sprega

Posted on Feb 18, 2024 • Edited on Feb 20, 2024

Entity repositories for humans: part 1

#php #symfony #webdev #programming

TL;DR: this article describes a pattern for creating entity repositories that we use at Slope, but first it explains other strategies that we previously explored and discarded. If you're only interested in the outcome, you can look directly at the GitHub repository.

Fetching entities from a database is a necessity that pretty much every web application has. In case of fairly structured applications, this often happens through the use of repositories (if you have never heard of the repository pattern this is a pretty good introduction).

If you use Doctrine ORM, most likely you have already used its repositories. In case you're not familiar with them, you can get an idea by looking at the examples in the documentation.

NOTE: as Slope is built using Symfony, the approach described here is based on Doctrine. If you use Laravel, however, it should be possible to use the same approach also with Eloquent, with a few adaptations, by building some abstraction around its query builders.

The starting point

As the business domain becomes more complicated, properties and relationships between entities increase. With them, the queries that you need to fulfill the use cases of your application become increasingly more complex.
With necessities increasing over time, you will probably need to switch from default Doctrine repositories to custom ones. The moment you cross that line, a world of different possibilities/implementations opens up.

At Slope we constantly strive for the best possible developer experience. This case made no exception: we wanted an approach that was elegant, easy to use but at the same time flexible and that left room for optimizations when our DBMS (PostgreSQL) needed them.

Let's go step by step: before presenting our current solution, it makes sense to mention the various approaches that we have tried previously.

Approach #1: standard Doctrine repositories with magic methods

This is most likely the first approach that you discover once you start working with Doctrine.
By default, Doctrine provides repository classes in which you don't have to implement any method. You can just use their dynamic interfaces that define several findBy*, findOneBy*, countBy* methods based on entity metadata, i.e. it "adds" one method for each property and one for each association defined on its entity. Note that these methods are not statically defined, but implemented dynamically by Doctrine via a __call method. The official documentation describes them here.

You can find this approach implemented in branch approach-1.

Pros:

You can use them out of the box.

Cons:

It is only possible to filter by the properties defined by the entity, so it only works with rather simple models.
Combining more than one filter is very cumbersome.
The fact that repositories use magic methods with associative arrays as parameters makes this approach quite "obscure"; the resulting DX is pretty bad, as you don't have autocompletion: discovering and debugging these methods is not exactly a breeze.

Magic is nice. Just not in your repositories.

Approach #2: standard Doctrine repositories using Criteria

The same repositories described above can also be queried via Criteria objects. These objects represent specific filters (and/or combinations of them) and can be passed to the repository in order to extract matching entities. You can find more information about them here.

This approach is demonstrated in branch approach-2.

Pros:

Criteria objects can be used interchangeably with repositories and entity collections for filtering.
Criteria are highly composable and use much less "magic" than the previous approach.
Client classes that use this approach are forced to know the internal details of the entity about how attributes and relationships are defined (because they work directly with the attribute and association names). Any optimizations, workarounds to Doctrine limitations or denormalizations can no longer be kept “secret”, and it will be necessary to modify all clients if, for example, the name of an attribute or an association changes.

Cons:

The approach is super verbose and impractical. The more filters you need, the worse the situation becomes.
Criteria objects are very cumbersome to mock in unit tests.

Approach #3: standard Doctrine repositories with custom `findBy*` methods

At some point we decided that complexity was too much, and we started implementing our findBy*/countBy* methods, one for each combination of filters we needed to fulfill our use cases.
Each of these methods constructs its own query builder, applies its filtering logic and produces a specific result. These methods can be parametrizable as for the relationship that have to be joined in advance, plus all the various possible orderings.

branch approach-3 shows this last intermediate approach.

Pros:

Very simple to use for client classes, because they just need a simple method call with just a few parameters (often just one).
Mocking these methods in unit tests is very simple, a one-liner in most cases.

Cons:

Every method is extremely specific and does not scale well when complexity increases. We often ended up with dozens of findBy* methods per repository, due to the combinatorial explosion of needed filters.
As every method composes its own query builder, this pattern inevitably causes code duplication in repositories, especially if you are not diligent enough to factor common filtering logic across multiple methods.
During development, when you have to add a new use case, you are often in doubt when deciding whether to complicate an existing method maybe with additional parameters, or to write a new method from scratch.

There must be a better way!

None of the approaches described above felt "good": the cons were always prevailing on the pros. We wanted something better, ideally based on this wishlist:

Repositories should be classes with "clean" interfaces (i.e. no Doctrine specific public methods)
They should be dead simple to use (and to mock), ideally with just a single method call (no chaining allowed!) and a single parameter.
They should guarantee complete shielding of storage/implementation details of entities: the query builder must never leave the repositories, and Doctrine Criteria are not accepted. This way, when changing internal details of an entity, you just need to change code inside its repository and not in other classe.
You should not need to implement a new method every time you add a new filter to an existing entity. Adding parameters to existing methods is not acceptable either, because it complicates the interfaces and scales poorly as the filter combinations grow.
There should be a handy way of prefetching relationships (to avoid the infamous N+1 problem).
Possibility of paginating results without changing how client classes query the repository.
Support for composition via subqueries.

Over time and after various contributions by many team members, we got to define a pretty good pattern that checked all the items in the checklist above.

We are happy to share the results, hoping that it can be useful for others as well. For this purpose, we created a GitHub repository with a living example of it, that will evolve when we publish more article on this series.

Before you ask: the example in our repository is based on a simple domain for managing a video rental shop. You know, we're just a bunch of nostalgic millennials.

Disclaimer: Running a movie rental shop in 2024 may not be the best business idea.

Introducing the “entity query”

The fundamental idea behind this solution is that each repository has a "companion" class, which we will call entity query.
Entity queries are in fact simple DTOs that must be instantiated providing filters by classes that need to fetch entities from repositories.

They are convenient to create because every filter has its named parameter in the query constructor. All of them are nullable: this means that regardless of the combination of filters you want to apply, you can always build an entity query inline and specify only the filters that you are interested in, in any order you want.

Conceptually speaking, all the filters are applied in logical AND with each other to produce the result set. This is a pretty strong simplification, but it works well for us because we still manage to cover most of our typical use cases with it.

I like to say that a few lines of code are better than a thousand words, so here are excerpts of the Rental entity and its entity query present in the master branch of the GitHub repository:

#[ORM\Entity]
class Rental
{
    #[ORM\Id]
    #[ORM\GeneratedValue(strategy: 'NONE')]
    #[ORM\Column('id', type: 'guid')]
    public readonly string $id;

    #[ORM\ManyToOne(targetEntity: Movie::class)]
    #[ORM\JoinColumn('movie_id')]
    public readonly Movie $movie;

    #[ORM\ManyToOne(targetEntity: Customer::class)]
    #[ORM\JoinColumn('customer_id')]
    public readonly Customer $customer;

    #[ORM\Column('rent_date', type: 'datetimetz_immutable')]
    public readonly \DateTimeImmutable $rentDate;

    #[ORM\Column('return_date', type: 'datetimetz_immutable', nullable: true)]
    private ?\DateTimeImmutable $returnDate = null;

    // ...
}

class RentalQuery extends AbstractEntityQuery
{
    public const ORDER_BY_RENT_DATE_DESC = 'RENT_DATE_DESC';

    public function __construct(
        public ?Movie $movie = null,
        public ?Customer $customer = null,
        public ?\DateTimeImmutable $rentDateGreaterThan = null,
        public ?\DateTimeImmutable $rentDateLessThanOrEqualTo = null,
        public ?bool $returned = null,
        array $orderBy = [],
    ) {
        parent::__construct($orderBy);
    }
}

And here are a few examples of how this entity query can be used:

$repository->executeForCount(
    new RentalQuery(customer: $aCustomer)
);

$repository->executeForManyResults(
    new RentalQuery(
        rentDateGreaterThan: new \DateTimeImmutable('8 days ago'),
        returned: false,
    )
);

$repository->executeForFirstOrNullResult(
    new RentalQuery(
        movie: $aMovie,
        orderBy: [RentalQuery::ORDER_BY_RENT_DATE_DESC]
    )
);

The public interface of an entity repository

Repositories feature a set of method that work with entity queries plus some other methods that perform unfiltered counts and fetches, as well as fetches by id.
Note that the same entity query can be used both to get the results hydrated as entities and to get their counts. Here is an excerpt from the public interface:

/**
 * Base class for entity repositories.
 * @template E of object
 * @template Q of AbstractEntityQuery
 */
abstract class AbstractEntityRepository
{
    /**
     * Returns a count of all entities.
     */
    public function countAll(): int;

    /**
     * Executes the entity query returning the count of matching results.
     *
     * @param Q $entityQuery
     */
    public function executeForCount(AbstractEntityQuery $query): int;

    /**
     * Executes the entity query returning only the first result (or null if no results found).
     *
     * @param Q $entityQuery
     * @return ?E
     */
    public function executeForFirstOrNullResult(AbstractEntityQuery $entityQuery): ?object;

    /**
     * Executes the entity query returning many results (optionally limited using $limit parameter).
     *
     * @param Q $entityQuery
     * @return E[]
     */
    public function executeForManyResults(AbstractEntityQuery $entityQuery, ?int $limit = null): array;

    /**
     * Executes the entity query expecting zero or one result.
     *
     * @param Q $entityQuery
     * @return ?E
     * @throws NonUniqueResultException If more than one entity matches the query.
     */
    public function executeForOneOrNullResult(AbstractEntityQuery $entityQuery): ?object;

    /**
     * Executes the entity query expecting exactly one result.
     *
     * @param Q $entityQuery
     * @return E
     * @throws NonUniqueResultException If more than one entity matches the query.
     * @throws NoResultException If no entity matching the query is found.
     */
    public function executeForSingleResult(AbstractEntityQuery $entityQuery): object;

    /**
     * Returns all entities.
     *
     * @return E[]
     */
    public function findAll(): array;

    /**
     * Returns a single entity by ID.
     *
     * @return E
     * @throws \Doctrine\ORM\NoResultException
     * @throws \Doctrine\ORM\NonUniqueResultException
     */
    public function findByID(string $id): object;

    /**
     * Returns multiple entities by their IDs.
     *
     * @param string[] $ids
     * @return E[]
     */
    public function findByIDs(array $ids): array;
}

It is important to note that concrete repositories do not need to define new public methods to fetch entities: they just need to implement their own logic to configure the Doctrine query builder based on the provided entity query. Here's the code fragment that does this for RentalRepository:

/**
 * @extends AbstractEntityRepository<Rental, RentalQuery>
 */
class RentalRepository extends AbstractEntityRepository
{
    // ...

    protected function configureBuilderFromEntityQuery(QueryBuilder $builder, AbstractEntityQuery $query): QueryBuilder
    {
        /* Validation */

        if ($query->rentDateLessThanOrEqualTo !== null) {
            Assertion::lessThan(
                $query->rentDateGreaterThan,
                $query->rentDateLessThanOrEqualTo,
                'rentDateGreaterThan cannot be greater than rentDateLessThanOrEqualTo.'
            );
        }

        /* Filtering */

        if ($query->movie !== null) {
            $builder->andWhere('e.movie = :movie')->setParameter('movie', $query->movie);
        }

        if ($query->customer !== null) {
            $builder->andWhere('e.customer = :customer')->setParameter('customer', $query->customer);
        }

        if ($query->rentDateLessThanOrEqualTo !== null) {
            $builder->andWhere('e.rentDate <= :rentDateLessThanOrEqualTo')
                ->setParameter('rentDateLessThanOrEqualTo', $query->rentDateLessThanOrEqualTo);
        }

        if ($query->rentDateGreaterThan !== null) {
            $builder->andWhere('e.rentDate > :rentDateGreaterThan')
                ->setParameter('rentDateGreaterThan', $query->rentDateGreaterThan);
        }

        if ($query->returned === true) {
            $builder->andWhere('e.returnDate IS NOT NULL');
        } elseif ($query->returned === false) {
            $builder->andWhere('e.returnDate IS NULL');
        }

        /* Ordering */

        foreach ($query->orderBy as $orderBy) {
            switch ($orderBy) {
                case RentalQuery::ORDER_BY_RENT_DATE_DESC:
                    $builder->addOrderBy('e.rentDate', 'DESC');
                    break;
                default:
                    throw new \Exception("Unknown order by: $orderBy");
            }
        }


        return $builder;
    }
}

Limitations

The greatest limitation of this approach is that, as already mentioned, all filters work with each other only in an AND fashion. So, if you set two properties on an entity query, only entities that match both filters will be returned.

However, it is always possible to overcome this limitation by defining "virtual" properties that correspond to more complex filters to be implemented in the repository (for example, two properties in OR). We will discuss this in detail in a future post.

Another limitation is that if you need to filter on a property using N different operators, you must define N different filters on the same property.
Looking at the code above, $rentDateGreaterThan and $rentDateLessOrEqualTo are an example of this limitation.
We believe that this is a reasonable price to pay for not having to complicate the construction of the query too much. In case this is a problem, though, it is still possible to implement "operator" classes to use in entity queries. We are not currently doing this, but as this is a possible evolution we might talk about this in a future post.

There’s more to come!

The approach just described here is just a simplified version of the one we actually use in production at Slope.

There are still many interesting missing pieces, which we planned to add and describe in future articles. In the next episodes we will talk about:

Filters with virtual properties
Filters based on associated entities
Support for multi-tenancy
Prefetching associations
Pagination of results
Subqueries

Make sure to star the GitHub repository to not miss the updates we will make as new articles in this series get published.

We are eager to hear your opinions. Feel free to contact us (engineering <at> slope.it), via Github or by leaving comments below!

DEV Community

Entity repositories for humans: part 1

The starting point

Approach #1: standard Doctrine repositories with magic methods

Approach #2: standard Doctrine repositories using Criteria

Approach #3: standard Doctrine repositories with custom `findBy*` methods

There must be a better way!

Introducing the “entity query”

The public interface of an entity repository

Limitations

There’s more to come!

Top comments (0)

Read next

Revolutionary Two-Layer Framework Makes Agent-Based Models More Realistic and Adaptive

Finally got some time to play with the new JSONata and Variables support for Step Functions, and I have to say, it is massive improvement. Check out my latest blog post, where I walk through a simple example of how easy it is to handle pagination now

React 19: New hook useActionState

What I'd Like to Learn in 2025

The starting point

Approach #1: standard Doctrine repositories with magic methods

Approach #2: standard Doctrine repositories using Criteria

Approach #3: standard Doctrine repositories with custom findBy* methods

There must be a better way!

Introducing the “entity query”

The public interface of an entity repository

Limitations

There’s more to come!

Read next

Revolutionary Two-Layer Framework Makes Agent-Based Models More Realistic and Adaptive

Finally got some time to play with the new JSONata and Variables support for Step Functions, and I have to say, it is massive improvement. Check out my latest blog post, where I walk through a simple example of how easy it is to handle pagination now

React 19: New hook useActionState

What I'd Like to Learn in 2025

Approach #3: standard Doctrine repositories with custom `findBy*` methods