Manuel Kanetscheider

Posted on May 22, 2023

A Python Implementation of the Unit of Work and Repository Design Pattern using SQLModel

#python #designpatterns #tutorial #programming

In this blogpost I would like to introduce two design patterns:

Repository pattern
Unit of Work pattern

So let's get started!

Repository pattern

The Repository pattern is a Domain-Driven Design pattern intended to keep persistence concerns outside of the system's domain model.

This means that the repository pattern represents our data layer and is a strict separation between domain and data model. A concrete repository gives access to an entity, an entity consists of one or more related tables:

Source: Design the infrastructure persistence layer - Define one repository per aggregate

A repository encapsulates all CRUD operations, in addition the repository also acts as an interface between domain and data model. This means that the domain and data model are only loosely coupled via well-defined interfaces.

This approach offers several advantages:

❇️ Separation of domain and data model.
❇️ Enabling of DDD and TDD. For each repository you can easily create so called fake repositories, which in turn can be used to create unit tests.
❇️ Centralization of queries, all query functions are defined and maintained in one place.

However, this pattern offers not only advantages:

🔴 Introduces another layer and increases the complexity.
🔴 Leads, in dependence of in the implementation, to redudant code because eventually domain model and data model must be maintained twice and additionally a mapping logic between domain and data model and vice versa is required.

Unit of Work pattern

This is a pattern that is commonly used with the repository pattern. To understand this pattern, we must first define the term "unit of work":

A unit of work refers to a single transaction that involves multiple insert, update, or delete operations.

In other words, all database operations should be handled in a single database transaction. This means that the repositories do not commit their respective transactions themselves, the database session of the repositories is managed by the unit of work instance, therefore the unit of work is also responsible for committing or rolling back the transaction.

This is especially important when multiple repositories are involved: the unit of work instance ensures that either all or none of the database operations are committed, keeping our data consistent.

Implementation

A special challenge in the implementation is that the domain model must not be dependent on the data layer, otherwise the loose coupling is no longer given. For Java and C# there are various tutorials on how to implement this pattern, something that almost all these implementations have in common is that the domain and data model are developed separately, so there are separate domain model and separate data classes.

Thus, the repository pattern is also responsible for the mapping between domain and data model. A consequence is that there are inevitably certain redudances and an additional mapping logic. In Python you would have basically the same problem but that is exactly where SQLModel comes to the rescue.

fastapi / sqlmodel

SQL databases in Python, designed for simplicity, compatibility, and robustness.

SQLModel, SQL databases in Python, designed for simplicity, compatibility, and robustness.

Documentation: https://sqlmodel.tiangolo.com

Source Code: https://github.com/fastapi/sqlmodel

SQLModel is a library for interacting with SQL databases from Python code, with Python objects. It is designed to be intuitive, easy to use, highly compatible, and robust.

SQLModel is based on Python type annotations, and powered by Pydantic and SQLAlchemy.

The key features are:

Intuitive to write: Great editor support. Completion everywhere. Less time debugging. Designed to be easy to use and learn. Less time reading docs.
Easy to use: It has sensible defaults and does a lot of work underneath to simplify the code you write.
Compatible: It is designed to be compatible with FastAPI, Pydantic, and SQLAlchemy.
Extensible: You have all the power of SQLAlchemy and Pydantic underneath.
Short: Minimize code duplication. A single type annotation does a lot of work. No…

View on GitHub

So what can SQLModel do for us? SQLModel is from the same developer who developed FastAPI, so this library works very well with FastAPI, but can also be used for other projects.

In FastAPI domain models are developed with Pydantic, Pydantic is responsible for data validation and serialization or deseralization of the data. FastAPI has no built-in ORM framework, a popular ORM framework is SQLAlchemy. What SQLModel does now is the combination of Pydantic and SQLAlchemy, so the models become domain and data model at the same time.

Thanks to this particular feature, a major pain point can be eliminated, namely the need to maintain domain and data models, as well as the maintenance and development of a mapping logic between domain and data model.

Overview

First, let's get a brief overview of the software we are going to create. For the tables I will follow the tutorials of SQLModel a little bit. Let's take a look at the class diagram first:

This probably looks a bit overwhelming now, but stick with me, let's go through it step by step:

GenericRepository: This is the abstract base class of a repository, in which all general repository functions are defined.
GenericSqlRepository: Inherits from GenericRepository. This class wraps SQLModel and as the name suggests acts as a generic SQL repository. All concrete repositories will be inherited from this class.
HeroReposityBase and TeamRepositoryBase: These classes are abstract classes that inherit from GenericRepository, where specific functionalities of the respective entities (hero and team) are defined. There are two approaches to the repository pattern: the generic and the "one repository per entity" approach. I personally prefer the second one, because you have the possibility to define specific functionalities, but this is a matter of preference or depends on the requirements.
HeroRepository and TeamRepository: These repositories are concrete repositories, they inherit from GenericSqlRepository and from the respective entity base repositories (HeroReposityBase or TeamRepositoryBase), in which the entity specific functions are defined. The entity specific functions are implemented in these repositories.
UnitOfWorkBase: Abstract context manager which defines the basic functions of the Unit of Work interface.
UnitOfWork: Concrete Unit of Work, inherits from the base class and wraps and manages the database session.

Generally speaking, all of the above classes can be divided into two categories: abstract and concrete classes. For example, if you want to create fake classes for unit tests, you would have to create fake classes that inherit from all the "base" classes, i.e. from the following:

GenericRepository
HeroReposityBase
TeamRepositoryBase
UnitOfWorkBase

All other classes are concrete classes and cannot be used for unit tests, unless you create integration tests in which you use e.g. an in-memory SQLite database.

In this blogpost I will not cover the creation of unit tests, but as I said before only the abstract classes need to be derived. This concept is especially strong when working with a dependency injection system, so in unit tests the dependencies can be easily injected using DI, creating a loose coupling between domain and data layer.

Let's write some code!

Defining the models

As you can see here I have defined my own base model, the settings chosen here are similar to what you would use in a FastAPI project. In the config you basically only set that the models can be imported and exported in a JSON compliant style.

The concrete tables inherit from the base model. Contrary to the official documentation, I defined the columns via the property sa_column, passing an SQLALchemy column object as a parameter. This allows a more precise column definition, so the column name and the exact data type can be specified accurately. I personally prefer this approach, but of course you can also define the tables as described in the official SQLModel documentation.

ORM

Engine and session factory

This file contains the code needed to connect to a database, i.e. the creation of a SQLModel engine and session maker function. The latter returns a function which in turn returns a new database session whenever it is called.

Repositories

Generic Repository

As before, this is the base repository class which defines all basic functions (= CRUD).

SQL Repository

This repository is a generic SQL repository. The constructor takes as arguments a database session and a SQLModel type.

Here all methods are implemented and two new methods for the creation of the GET and LIST SQL statements. By introducing these two methods you could customize the SQL statements for GET and LIST in the derived repositories.

Base Entity Repositories

These repositories are the abstract base classes of the respective repositories. These classes are completely optional, but offer the advantage that entity-specific functions can be defined here, which should not be available for all repositories.

Entity repositories

These are now concrete repositories, here one could overload the functions of the general SQL repository, in addition the abstract methods of the respective base repositories are implemented here.

Service Layer

The Unit of Work functionalities are not directly part of the ORM framework, but is a service that connects everything together.

Unit of Work

The service unit of work is responsible for the handling of the session and thus guarantees that everything runs as a single unit of work.

This service is a context manager that calls the session_maker aka session factory function each time in the enter method and thus creates a new session each time. This session is passed on to the respective repositories, thereby ensuring that all operations run within one transaction.

The rollback functions are always called at the end of the context manager. If the commit function was called before, the rollback function has no effect. If the commit function was not called before or an error occurred, the entire transaction is rolled back.

Let's test it!

Now that we have written all the necessary code, it is time to make some database calls:

Even if it requires a lot of boiler code, especially for larger projects these design patterns really pay off, because they achieve a strict separation of domain and data layer, and the layers are only loosely coupled with each other.

As already mentioned, you could now derive the base classes and test the business logic without database operations. Alternatively, you can also run the unit tests with a SQLite in-memory database.

Conclusion

If you've made it so far, thanks for reading! I hope I could bring you the two design patterns a bit newer and encourage you to use these patterns for your next project. Even though there is a lot of code involved in setting up this structure, I think these patterns will pay off especially for larger projects.

All code is available at this repository, check it out!

manukanne / sqlmodel-repository-pattern

SQLModel Repository pattern

This repository includes a example implementation of the Unit of Work design and Repository design pattern.

For more details please check out my blog post!

---
title: Unit of Work + Repository pattern
---
classDiagram
     direction LR
    GenericRepository <|--GenericSqlRepository
    GenericRepository <|-- HeroReposityBase
    GenericRepository <|-- TeamRepositoryBase

    GenericSqlRepository <|-- TeamRepository
    TeamRepositoryBase <|-- TeamRepository
    GenericSqlRepository <|-- HeroRepository
    HeroReposityBase <|-- HeroRepository

    UnitOfWorkBase <|-- UnitOfWork
    HeroReposityBase *-- UnitOfWorkBase
    TeamRepositoryBase *-- UnitOfWorkBase

    HeroRepository *--UnitOfWork
    TeamRepository *--UnitOfWork

    class GenericRepository~SQLModel~{
        +get_by_id(int id) SQLModel
        +list(**filters) List~SQLModel~
        +add(T record) SQLModel
        +update(T recored) SQLModel
        +delete(id int)
    }

    class GenericSqlRepository~SQLModel~{
        -_construct_list_stmt(id)
        -_construct_list_stmt(**filters)
    }
    class HeroReposityBase{
        +get_by_name(str name) Hero
    }
    class HeroRepository{

    }
    class TeamRepositoryBase{
        +get_by_name(str name) Team
    }
    class TeamRepository{
    }
    class UnitOfWorkBase{
        teams: TeamRepositoryBase
        heroes: HeroReposityBase
        + __enter__()
        + __exit__()
        + commit()
        + rollback()
    }
    class UnitOfWork{
        teams: TeamRepository
        heroes: HeroRepository
    }

Getting Started

Tech-stack

Python 3.10
SQLModel
- Pydantic
- SQLAlchemy

Setup

1. Clone the project:

git

…

View on GitHub

Further references

Acknowledgments

Top comments (2)

Gilberto • Oct 16 '23 • Edited

Hi Manuel Kanetscheider! Great article

I am using SQLModel and implementing the repository pattern just as you described. Help me clarify a point. If I am unit testing the Unit of Work (or any concrete repository class) why should I mock only the *Base classes? I was thinking about mocking only the GenericSQLRepository methods (I am using pytest-mock). Can you describe these tests in more depth? Why should I mock the Base classes instead of the concrete classes? (I am also having trouble on mocking inherited classes, in general)

Manuel Kanetscheider • Oct 17 '23 • Edited

Hello Gilberto,
Thank you very much!

There are several options for implementing the tests, the choice depends on the specific requirements.
Besides mocking, there are two other approaches, namely the creation of fake classes or the use of an in-memory database.

In the "fake" approach, a fake repository is created for each repository and made available via a fake unit of work service. This approach requires the use of dependency injection, in the tests the actual unit of work service is replaced by the fake one or in other words, instead of the "real" unit of work service, the fake service is injected instead.
This approach requires additional boiler plate code, but is more flexible and allows the creation of independent tests that only verifies the functional design.

In the second approach, an in-memory database can also be used, which is easier to set up and does not require any changes to the existing classes. However, one can argue with this approach whether it is not already integration tests, in addition to the functional design, the generation and execution of the SQL statements are also tested, furthermore, if database-specific SQL statements are used, there is also a chance that the tests with the SQLite database will fail.

In conclusion, with the first approach only the base classes are derived and subsequently so-called fake classes are implemented. This is the case because no SQL-specific functionalities are tested here. Theoretically, a repository could also be used to encapsulate an API. The repository pattern can be extended to other data sources, in such a case the derivation of the concrete (SQL) classes would not make sense.
The choice depends on your requirements, but there is no wrong way and there are several ways to implement unit tests. Maybe this discussion will help you: softwareengineering.stackexchange....

Hope that helps you :)

DEV Community