Finally, it's time to do what I've been planning for a long time - create a detailed guide to working with the asynchronous version of SQLAlchemy 2.0 in the ORM style. This series of articles will cover absolutely all aspects: from models and relationships between them to migrations and methods of interacting with data in the database.
What's planned?
I'm going to write several articles that will be balanced between the necessary "boring" theory and practical examples to help everyone who is already familiar with Python master this "magical" framework. And believe me, you will soon understand that alchemy in the name of the framework was not chosen by chance.
Today's Topic
First, let's understand what SQLAlchemy is and why every developer working with relational databases (such as SQLite, PostgreSQL, MySQL, etc.) should know about it. After that, the setup. We'll be working with PostgreSQL, but don't worry: the code we'll write is universal for all relational databases. We'll start with the basic setup of SQLAlchemy for asynchronous interaction, and then move on to creating tables in a modern declarative style.
Table Models
Today, we'll learn how to describe table models and discuss different types of columns - from common ones (such as Integer, String, Text) to specific ones (ARRAY, JSON).
Table models in SQLAlchemy are Python classes that directly reflect the structure of tables in the database. Each model describes a specific table, its columns and their data types. Models allow you to work with database records as with regular objects, which greatly simplifies interaction with the database.
Today we will look at how to establish relationships at the database level using foreign keys (ForeignKey), as well as how to use more advanced relationships in SQLAlchemy, which greatly simplify working with data thanks to relationship(). Let's start with the basic types of relationships: "One to One" (1:1), "One to Many" (1:N) and "Many to One" (N:1).
The only things we will leave aside for now are "Many to Many" (N:M) relationships and indexes. We will cover those topics in the following articles.
After we have a foundation in the form of table models, we will create the tables themselves. For this, we will use the Alembic library, which allows you to easily manage database migrations and integrates perfectly with SQLAlchemy. Thanks to this tool, our application will evolve along with the database, which is especially important when working in a team.
A bit of theory
For those who are just getting acquainted with SQLAlchemy, it is important to understand that this is not just a framework, but a full-fledged tool for working with relational (tabular) databases. It supports two styles: Core and ORM.
Core and ORM
Core is a low-level approach that allows you to execute queries using SQL expressions, providing full control over the process. This style is suitable for those who want to get as close as possible to standard SQL or have special performance requirements.
ORM (Object-Relational Mapping) is a style in which the framework maps database tables to Python classes. With ORM, you work with objects, not with SQL strings. This is the style we will focus on, as it is more convenient, universal and popular among developers.
ORM Core Components
Working with SQLAlchemy in an ORM style involves several key concepts:
Table Models are Python classes that represent database tables. These classes contain information about the structure of tables, such as columns, data types, and relationships between tables.
Sessions are objects through which interaction with the database is carried out. They allow you to execute queries and commit changes. A session is opened at the beginning of work with the database and closed at the end, providing communication with the database for the duration of one “session”.
Session Factory is a template for creating sessions. It is used to manage the connection to the database and create new sessions as needed.
Why should every Python developer know SQLAlchemy?
SQLAlchemy simplifies working with databases by turning them into intuitive Python objects. This makes your code cleaner and more readable, since you are writing Python, not SQL. In addition, it allows you to work with different databases with almost no code changes — a great feature for those working in teams that require different types of databases. SQLAlchemy also offers powerful tools for managing relationships and migrations, making it a versatile choice for developing large projects.
Relationships between tables
As mentioned earlier, SQLAlchemy allows you to establish relationships between tables through foreign keys. Let's look at the main types of relationships:
"One to One" (1:1) — used when each record in one table should correspond to only one record in another table. For example, a user profile can be related to a user account in a one-to-one relationship.
"One to Many" (1:N) — with this relationship, one record in one table can correspond to multiple records in another table. For example, one user can have multiple blog posts.
"Many to One" (N:1) — a reverse "one to many" relationship. In this case, multiple records from one table can refer to a single record in another table, for example, multiple comments can be linked to a single post.
Such relationships allow you to build complex data structures and simplify access to related data.
Let's prepare the database for work
We have several options for preparing the database for work using PostgreSQL as an example. Let's look at them:
Installing PostgreSQL with pgAdmin on a computer
Go to the official PostgreSQL website and download the latest installer (it bundles the pgAdmin administration tool).
Install it, and do not forget the password for the postgres superuser that you create during installation.
After installation, the PostgreSQL server will run locally on your computer.
Use the following parameters to connect:
Host: localhost
Port: 5432
Database: postgres
User: postgres
This method provides access to the database on your computer only.
Deploying PostgreSQL via a Docker container
Make sure you have Docker and Docker Compose installed on your computer.
Follow the instructions in this article to set up a database via Docker Compose.
When running locally, access will only be from your computer, and when running on a VPS server, you can configure global access.
The connection parameters will be the same as those you specified when configuring the Docker Compose file.
Deploying PostgreSQL using a service
This approach is convenient because all issues related to setting up, optimizing, and running the database are taken care of by the hosting service. One such service is Amverum Cloud.
You just need to register, get a bonus of $1 for registration and launch PostgreSQL with external access in a couple of minutes. Below I will tell you how to do this, since I choose this method myself.
Launching PostgreSQL with external access on Amverum Cloud
Register in Amverum Cloud, if you have not done so yet
Go to the projects section
Click on "Create a project". Next, give the project a name, select the "Database" service type and select a tariff. "Start" is suitable for educational purposes.
The name does not have to match the database name.
On the new screen, select the database type, give the database a name, come up with a username and password, enable the superuser and click "Finish". We also set the cluster size for the DBMS to 1 - this is enough for educational purposes.
Open external access to the database. To do this, go to the created project and open the "Settings" tab. On the screen that opens, click "Add free domain name", select the "POSTGRES" connection type, and click "Apply".
Now we will have a free domain, which will become a host for connection.
Here are the data for connecting to the database:
HOST is the address you got in the last step.
USER is either the user you created or the postgres superuser.
PASSWORD is the password you created yourself.
DATABASE is the database you created.
If you suddenly forget some parameters, you can always open the "Configuration" tab in your project and view the necessary information.
Testing the connection to the database
Regardless of which deployment option you have chosen, before we start working with SQLAlchemy, let's perform a test connection to the database.
For this, I suggest installing the free and convenient DBeaver Community program. It works universally with all tabular databases, including PostgreSQL.
Download DBeaver Community for your operating system
Launch
Click on "New connection" and select PostgreSQL from the list
Specify the connection details. I use the data received in Amverum Cloud.
Then click on the "Test connection" button. If the test succeeds, click "Done". The connection is established.
Preparing the project
Now that PostgreSQL is ready to work, let's start setting up the project in the IDE. I choose PyCharm, but it can be Atom, VSCode and other editors.
Create a project in PyCharm and activate the virtual environment.
Create a .env file to store environment variables for connecting to the database:
Example:
DB_HOST=alchemsy-yakvenalex.waw.amverum.cloud
DB_PORT=5232
DB_NAME=adlchemydb
DB_USER=admin
DB_PASSWORD=dsakjjASSDkk
Create a requirements.txt file and add the necessary libraries:
alembic==1.13.3
sqlalchemy==2.0.35
pydantic==2.9.2
pydantic-settings==2.5.2
asyncpg==0.29.0
Install dependencies:
pip install -r requirements.txt
Brief description of modules
Alembic: A tool for managing database schema migrations. Used to make changes to the database structure (adding, deleting, and changing tables and columns).
SQLAlchemy: The main library for working with databases in Python, providing ORM (object-relational mapping) and tools for executing SQL queries.
Pydantic: A library for data validation and data type management in Python. Pydantic allows you to conveniently define data schemas and check them for compliance with types. In the future, we will use this library to transform data from SQLAlchemy into familiar objects, such as dictionaries.
Pydantic-settings: An add-on to Pydantic that allows you to easily manage application configuration settings using Pydantic for validation and typing. We will use it to work with environment variables that will allow us to connect to a PostgreSQL database.
Asyncpg: An asynchronous PostgreSQL driver that SQLAlchemy uses for async connections.
Creating a database
Now it's time to decide what tables we will create and how to build the application logic around them. As a visual example, let's take a simplified version of any blog.
We will have users with a minimal set of data in the main table, which will be related to other tables:
Profile table - 1:1 relationship. Each user can have only one profile.
Posts table - 1:N relationship. One user can have many posts, but each post can only be written by one user.
Comments table - 1:N relationship. One user can have many comments, and each comment can only be written by one user.
This set of models will allow us to demonstrate how to describe tables and establish relationships between them for effective work with data.
This should give you a good understanding of how SQLAlchemy works and how to build models for different data types and relationships. Time to put it into code!
Prepare the configuration file
In the root of the project, I will create a file config.py. Its purpose is to load environment variables into the project and generate the connection URL for the PostgreSQL database. For these tasks, I will use pydantic-settings.
Here is a complete code example:
import os

from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    DB_USER: str
    DB_PASSWORD: str
    DB_HOST: str
    DB_PORT: int
    DB_NAME: str
    # DATABASE_SQLITE = 'sqlite+aiosqlite:///data/db.sqlite3'

    model_config = SettingsConfigDict(
        env_file=os.path.join(os.path.dirname(os.path.abspath(__file__)), ".env")
    )

    def get_db_url(self):
        return (f"postgresql+asyncpg://{self.DB_USER}:{self.DB_PASSWORD}@"
                f"{self.DB_HOST}:{self.DB_PORT}/{self.DB_NAME}")


settings = Settings()
Here we created a Settings class, placing all the variables from the .env file into it. In addition, we described a method that generates a URL for an asynchronous connection to the PostgreSQL database via SQLAlchemy.
Next, I assigned the settings variable an instance of the Settings class. Now we can access the methods and variables we need via dot notation. An example is below:
print("DB URL =>", settings.get_db_url())
print("DB HOST =>", settings.DB_HOST)
In the class, I left an example link for connecting to SQLite. It uses the aiosqlite driver for asynchronous work.
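For local experiments without PostgreSQL, that URL can be passed to the engine directly; a minimal sketch, assuming the aiosqlite driver is installed:
from sqlalchemy.ext.asyncio import create_async_engine

# Hypothetical local alternative (pip install aiosqlite):
engine = create_async_engine("sqlite+aiosqlite:///data/db.sqlite3")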
Database configuration file
To work with a database in SQLAlchemy, a separate file is usually created where the main settings live. We will call it database.py. Here is the minimum configuration:
from sqlalchemy.orm import DeclarativeBase
from sqlalchemy.ext.asyncio import AsyncAttrs, async_sessionmaker, create_async_engine

from config import settings

DATABASE_URL = settings.get_db_url()

# Create an asynchronous engine for working with the database
engine = create_async_engine(url=DATABASE_URL)

# Create a session factory to interact with the database
async_session_maker = async_sessionmaker(engine, expire_on_commit=False)


# Base class for all models
class Base(AsyncAttrs, DeclarativeBase):
    __abstract__ = True  # The class is abstract so that a separate table is not created for it
Configuration Description
DeclarativeBase: The base class for all models, from which all tables (table models) will inherit. We will use this feature of the class many times.
AsyncAttrs: A mixin that adds the awaitable_attrs accessor to models, letting you safely await lazy-loaded attributes in asynchronous code.
create_async_engine: A function that creates an asynchronous engine for connecting to the database at the provided URL.
async_sessionmaker: A session factory for asynchronous interaction with the database. Sessions are used to perform queries and transactions.
How it works
DATABASE_URL: Generated using the get_db_url method from the config.py configuration file. Contains all the information necessary to connect to the database.
engine: Asynchronous engine required to perform database operations.
async_session_maker: Session factory that allows creating sessions for interacting with the database, managing transactions and executing queries.
Base: Abstract base class for all models, from which all tables will be inherited. It will not create a separate table in the database, but will provide basic functionality for all other models.
This minimal set of settings provides a basic configuration for working with the database in SQLAlchemy. In the future, in this and the following articles, we will expand this file, adding new methods and settings for more complex tasks.
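To make the role of the session factory concrete, here is a minimal sketch of how it will be used (the check_connection helper is hypothetical; real queries come in the next article):
import asyncio

from sqlalchemy import text

from database import async_session_maker


async def check_connection():
    # Open a session, run a trivial query, and print the result
    async with async_session_maker() as session:
        result = await session.execute(text("SELECT 1"))
        print(result.scalar())  # -> 1


asyncio.run(check_connection())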
Let's expand the settings code
To better understand how this works, we will immediately add to our base class three universal columns that will appear in every table we create.
We will describe these columns only once, and they will be created in each table. In addition, let's create a universal method that will name our tables according to the following scheme: "table model name" + "s", converted to lowercase.
For example, if we had a model named User, then the table name will be users.
Here is the full modified code:
from datetime import datetime

from sqlalchemy import Integer, func
from sqlalchemy.orm import DeclarativeBase, declared_attr, Mapped, mapped_column
from sqlalchemy.ext.asyncio import AsyncAttrs, async_sessionmaker, create_async_engine

from config import settings

DATABASE_URL = settings.get_db_url()

# Create an asynchronous engine for working with the database
engine = create_async_engine(url=DATABASE_URL)

# Create a session factory to interact with the database
async_session_maker = async_sessionmaker(engine, expire_on_commit=False)


# Base class for all models
class Base(AsyncAttrs, DeclarativeBase):
    __abstract__ = True  # The class is abstract so that a separate table is not created for it

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    created_at: Mapped[datetime] = mapped_column(server_default=func.now())
    updated_at: Mapped[datetime] = mapped_column(server_default=func.now(), onupdate=func.now())

    @declared_attr.directive
    def __tablename__(cls) -> str:
        return cls.__name__.lower() + 's'
The Base base class defines three columns (id, created_at, updated_at) that will be added to all models inheriting from Base:
id: Primary key, which is automatically incremented.
created_at: Date and time of record creation, set automatically.
updated_at: Date and time of the last update of the record, automatically updated on each change.
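To see the naming scheme in action, here is a tiny illustration (Note is a hypothetical model that exists only to show the generated name):
from sqlalchemy.orm import Mapped

from database import Base


class Note(Base):
    text_value: Mapped[str]


print(Note.__tablename__)  # -> 'notes'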
Let's leave this file as is for now, but we'll get back to it soon.
You've probably noticed an unfamiliar syntax for describing table columns, namely Mapped and mapped_column. Let's figure out what they are and how to work with them.
Mapped
Mapped is a modern way to annotate data types for columns in SQLAlchemy models. It allows you to clearly indicate that a variable represents a column of a table in a database, making the code more readable and understandable.
Example:
id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
Here we specify that id is an int column that will be used as a primary key and will be automatically incremented.
mapped_column
mapped_column is a function that is used to create columns in SQLAlchemy models. It takes as arguments the data type of the column and additional parameters such as primary_key, nullable, default, etc.
Example:
created_at: Mapped[datetime] = mapped_column(server_default=func.now())
Here we create a created_at column that will be automatically populated with the current date and time when a record is created.
How to use it
Using Mapped and mapped_column, we can describe models more concisely and clearly.
Example for the user model:
from sqlalchemy import Integer, String
from sqlalchemy.orm import Mapped, mapped_column


class User(Base):
    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    name: Mapped[str] = mapped_column(String, nullable=False)
    email: Mapped[str] = mapped_column(String, unique=True, nullable=False)
In this example:
id: An integer primary key that is automatically incremented.
name: A string that cannot be empty (nullable=False).
email: A unique string that also cannot be empty.
Maximum flexibility and minimum code
With Mapped and type annotations, we can greatly simplify the description of models.
Example:
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
from sqlalchemy.ext.asyncio import AsyncAttrs


class Base(AsyncAttrs, DeclarativeBase):
    __abstract__ = True


class User(Base):
    __tablename__ = 'users'

    id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
    name: Mapped[str]
    surname: Mapped[str | None]
    email: Mapped[str] = mapped_column(unique=True)
What it means:
name: Mapped[str]: name is required (nullable=False by default), of type string. No need to explicitly specify mapped_column(String, nullable=False).
surname: Mapped[str | None]: surname is optional (nullable=True), since the data type indicates that it can be None. No need to explicitly specify mapped_column(String, nullable=True).
Combinations: the email field shows how we can add additional parameters to mapped_column when needed.
Benefits
Code Cleanliness: Simplifies and makes code more readable.
Type Explicitness: Indicates the data type up front, which is useful for understanding the data structure.
Flexibility: Allows easy control over whether fields are required.
Full Example
Here is how a full user model can be described using these principles:
from datetime import datetime

from sqlalchemy import func
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
from sqlalchemy.ext.asyncio import AsyncAttrs


class Base(AsyncAttrs, DeclarativeBase):
    __abstract__ = True


class User(Base):
    __tablename__ = 'users'

    id: Mapped[int] = mapped_column(primary_key=True, autoincrement=True)
    name: Mapped[str]
    surname: Mapped[str | None]
    email: Mapped[str] = mapped_column(unique=True)
    created_at: Mapped[datetime] = mapped_column(server_default=func.now())
    updated_at: Mapped[datetime] = mapped_column(server_default=func.now(), onupdate=func.now())
Later I will show how to simplify and shorten the code even further using Mapped.
In other words, if you need something "standard", use pure Mapped, and if you need a more detailed description, add mapped_column.
User model
Let's start describing the models. The first model we will describe is the user model. For now, we will not include SQLAlchemy relationships (relationship) in order not to complicate the code.
The only "unusual" thing we will use is foreign keys (ForeignKey), thereby laying the foundation for future relationships.
What is ForeignKey in simple terms
Simply put, ForeignKey is a way to link one table to another in the database. Imagine that you have two tables: users and posts. Each post needs an author, right?
This is where ForeignKey comes into play. In the posts table, we add a column that will store the user ID from the users table. This allows SQLAlchemy (and the database) to know who the author of each post is.
Code example:
from sqlalchemy import Integer, Text, ForeignKey
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
from sqlalchemy.ext.asyncio import AsyncAttrs


class Base(AsyncAttrs, DeclarativeBase):
    __abstract__ = True


class User(Base):
    __tablename__ = 'users'

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    name: Mapped[str]


class Post(Base):
    __tablename__ = 'posts'

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    title: Mapped[str]
    content: Mapped[str] = mapped_column(Text)
    user_id: Mapped[int] = mapped_column(ForeignKey('users.id'))  # Foreign key
This allows you to link each post to a specific user, which makes working with data much more convenient and structured. Later, relationship will be layered on top of this link - a dependency on the SQLAlchemy side, which we will also look at shortly.
Create a models.py file
Create a models.py file in the root of the project and describe the first model there:
from sqlalchemy import ForeignKey
from sqlalchemy.orm import Mapped, mapped_column

from database import Base


class User(Base):
    username: Mapped[str] = mapped_column(unique=True)
    email: Mapped[str] = mapped_column(unique=True)
    password: Mapped[str]
    profile_id: Mapped[int | None] = mapped_column(ForeignKey('profiles.id'))
Note that we haven't specified a table name or assigned a primary key (user ID). We don't need to, because the Base class will automatically add these columns when creating the table and name it users.
Optimizing code with annotations
We also noticed that the line Mapped[str] = mapped_column(unique=True) is repeated several times. To optimize this, we can use annotations.
Annotations in SQLAlchemy allow you to move the description of columns to separate variables. This means that we can describe the columns once and then use them in models. This approach simplifies the code and makes it more readable.
To use annotations, you need to import the Annotated object from the typing module. I usually describe annotations in the database.py file. Here is an example:
from typing import Annotated
from sqlalchemy import String
from sqlalchemy.orm import mapped_column
uniq_str_an = Annotated[str, mapped_column(unique=True)]
Annotation Description
Annotated is a tool from the typing module that allows you to add metadata to data types. In this case, we use it to describe a column in SQLAlchemy.
str is the Python data type, specifying that the column will be a string.
mapped_column(unique=True) specifies that the column must be unique, meaning that no two rows can repeat a value in this column.
An example of using annotation:
from sqlalchemy import ForeignKey
from sqlalchemy.orm import Mapped, mapped_column

from database import Base, uniq_str_an


class User(Base):
    username: Mapped[uniq_str_an]
    email: Mapped[uniq_str_an]
    password: Mapped[str]
    profile_id: Mapped[int | None] = mapped_column(ForeignKey('profiles.id'))
The code is more readable now, and it's starting to make sense why the framework we're talking about is called SQLAlchemy, right?
We won't go into detail about data types, since if you've read this far, you probably have a basic understanding of Python and database data types.
User Profile Table
Now we'll describe the user profile model (table). We'll include a few extra data types here to reinforce the point.
I suggest adding the following fields:
First name: string, required field
Last name: string, optional field
Age: integer
Gender: male / female — let's make one of the two values (ENUM) work
Profession: one of the provided values (ENUM)
Interests: list (Array) of strings
Contacts: JSON (Python dictionary) with an arbitrary list of contacts.
Before describing the model itself, let's prepare a set of values for gender and professions. To do this, let's first import enum (regular enum, not from SQLAlchemy).
import enum
Enum from the enum module in Python is used to create enumerations, which are sets of named values. This allows you to define data types with a limited set of possible values.
Enumerations for gender:
class GenderEnum(str, enum.Enum):
    MALE = "male"
    FEMALE = "female"
Enumerations for professions:
class ProfessionEnum(str, enum.Enum):
    DEVELOPER = "developer"
    DESIGNER = "designer"
    MANAGER = "manager"
    TEACHER = "teacher"
    DOCTOR = "doctor"
    ENGINEER = "engineer"
    MARKETER = "marketer"
    WRITER = "writer"
    ARTIST = "artist"
    LAWYER = "lawyer"
    SCIENTIST = "scientist"
    NURSE = "nurse"
    UNEMPLOYED = "unemployed"
Now let's describe the model itself.
class Profile(Base):
    first_name: Mapped[str]
    last_name: Mapped[str | None]
    age: Mapped[int | None]
    gender: Mapped[GenderEnum]
    profession: Mapped[ProfessionEnum] = mapped_column(
        default=ProfessionEnum.DEVELOPER,
        server_default=text("'UNEMPLOYED'")
    )
    interests: Mapped[List[str] | None] = mapped_column(ARRAY(String))
    contacts: Mapped[dict | None] = mapped_column(JSON)
Unfortunately, the ARRAY and JSON data types cannot be described via pure Mapped yet, so I used mapped_column here. Everything else will work correctly.
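To make the ARRAY and JSON columns concrete, here is a hypothetical Profile instance showing what these fields accept on the Python side:
profile = Profile(
    first_name="Alex",                  # required string
    gender=GenderEnum.MALE,             # ENUM member, not a raw string
    interests=["python", "databases"],  # ARRAY(String) takes a list of str
    contacts={"telegram": "@example"},  # JSON takes a plain dict
)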
Using default and server_default parameters
You probably noticed that we used the default and server_default parameters for the profession field. Let's take a closer look at how they differ and when to use them.
Parameter default:
This parameter sets the default value at the application (SQLAlchemy) level. If a value for this field is not specified when creating an object in code, the value from default is used. For example, when creating a Profile object, if no value is passed for the profession field, SQLAlchemy will automatically substitute the value specified in default.
Example: with an enumeration (ENUM) of professions, the default value can be selected with a dot, for example: ProfessionEnum.DEVELOPER.
Parameter server_default:
This parameter sets the default value at the database level. If a value for this field is not specified when inserting a record into the table, the database itself substitutes the value from server_default. Unlike default, this value applies when a record is added to the table directly, for example via raw SQL queries, bypassing the application.
Important: To use this parameter with ENUM, you must pass the value as a text expression using the text function imported from SQLAlchemy. The ENUM member name is specified in quotes as text, for example "WRITER", not the enum member itself, such as ProfessionEnum.WRITER. This is necessary for the correct execution of the query on the database side.
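A short sketch of where each default applies, assuming the Profile model above:
# default is applied by SQLAlchemy when the INSERT is flushed:
profile = Profile(first_name="Alex", gender=GenderEnum.FEMALE)
# profile.profession is still None here; after session.add(profile) and a
# commit, the row is inserted with ProfessionEnum.DEVELOPER.

# server_default only matters when the application is bypassed, e.g. raw SQL:
#   INSERT INTO profiles (first_name, gender) VALUES ('Alex', 'FEMALE');
# here the database itself fills profession with 'UNEMPLOYED'.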
Model for posts
Now we will describe a simple model for posts. First, I will prepare an enumeration with values for the post publication status.
class StatusPost(str, enum.Enum):
    PUBLISHED = "published"
    DELETED = "removed"
    UNDER_MODERATION = "under moderation"
    DRAFT = "draft"
    SCHEDULED = "delayed publication"
Now let's describe the model itself
class Post(Base):
    title: Mapped[str]
    content: Mapped[str] = mapped_column(Text)
    main_photo_url: Mapped[str]
    photos_url: Mapped[List[str] | None] = mapped_column(ARRAY(String))
    status: Mapped[StatusPost] = mapped_column(default=StatusPost.PUBLISHED, server_default=text("'DRAFT'"))
    user_id: Mapped[int] = mapped_column(ForeignKey('users.id'))
A fairly simple model that will contain all the required fields: title, text content (let's assume HTML-formatted text), link to the main photo, links to additional photos, publication status from the set, and user_id for the link.
We have already encountered the ARRAY(String) column description above. Let's capture it in an annotation:
array_or_none_an = Annotated[List[str] | None, mapped_column(ARRAY(String))]
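The final models below will also import a content_an annotation for long text fields; here is a minimal definition to add next to the others in database.py, assuming the content columns are plain Text:
from typing import Annotated

from sqlalchemy import Text
from sqlalchemy.orm import mapped_column

content_an = Annotated[str, mapped_column(Text)]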
In our current code, enumerations (ENUMs) take up a lot of space. Let's move them to a separate file called sql_enums.py.
import enum


class GenderEnum(str, enum.Enum):
    MALE = "male"
    FEMALE = "female"


class StatusPost(str, enum.Enum):
    PUBLISHED = "published"
    DELETED = "removed"
    UNDER_MODERATION = "under moderation"
    DRAFT = "draft"
    SCHEDULED = "delayed publication"


class ProfessionEnum(str, enum.Enum):
    DEVELOPER = "developer"
    DESIGNER = "designer"
    MANAGER = "manager"
    TEACHER = "teacher"
    DOCTOR = "doctor"
    ENGINEER = "engineer"
    MARKETER = "marketer"
    WRITER = "writer"
    ARTIST = "artist"
    LAWYER = "lawyer"
    SCIENTIST = "scientist"
    NURSE = "nurse"
    UNEMPLOYED = "unemployed"
Model with comments
In this model, we will need to associate a comment with both the author of the comment and the post to which this comment was left.
Also, as a demonstration, let's add an is_published column here - a Boolean flag indicating whether the comment is published. Just to see how such fields are described.
And let's add a rating from 1 to 10, to consolidate the ENUM topic.
Let's describe the ENUM (this one also belongs in sql_enums.py):
class RatingEnum(int, enum.Enum):
    ONE = 1
    TWO = 2
    THREE = 3
    FOUR = 4
    FIVE = 5
    SIX = 6
    SEVEN = 7
    EIGHT = 8
    NINE = 9
    TEN = 10
Now let's describe the model itself:
class Comment(Base):
    content: Mapped[str] = mapped_column(Text)
    user_id: Mapped[int] = mapped_column(ForeignKey('users.id'))
    post_id: Mapped[int] = mapped_column(ForeignKey('posts.id'))
    is_published: Mapped[bool] = mapped_column(default=True, server_default=text("'false'"))
    rating: Mapped[RatingEnum] = mapped_column(default=RatingEnum.FIVE, server_default=text("'SEVEN'"))
If you have been reading this article carefully, by now you should have a full understanding of how to describe columns using the SQLAlchemy ORM in a modern style. Now we will move on to a more complex topic - describing dependencies between tables.
Defining dependencies
Here I will try to explain the topic of dependencies in as much detail as possible, as I know that beginners often have difficulties at this stage. My goal is to help you fully understand how to describe relationships in SQLAlchemy models.
One-to-One
1. Example of a relationship
We have two models: User and Profile. The relationship between them is organized according to the One-to-One principle: each user has only one profile, and each profile belongs to only one user.
2. Why One-to-One?
Typically, a one-to-one relationship is used to separate primary and secondary information. For example, user information (name, email) can be stored in the User table, while profile data (age, profession, interests) is stored in a separate Profile table. This avoids making the main table unnecessarily wide, which simplifies its maintenance and data management.
3. Technical implementation
In SQLAlchemy, relationship() allows you to set up and manage relationships between tables. In the case of a one-to-one relationship between the User and Profile tables, the setup looks like this:
Relationship in the User model:
profile: Mapped["Profile"] = relationship(
    "Profile",
    back_populates="user",
    uselist=False,  # Key parameter for a one-to-one relationship
    lazy="joined"   # Automatically loads the profile when the user is requested
)
back_populates="user": specifies the reverse attribute on the Profile model. This means that when accessing a profile, you can also get the associated user.
uselist=False: specifies that the relationship is not a list (by default, relationship assumes a one-to-many or many-to-one relationship). Here there is one profile per user, so uselist=False specifies a one-to-one relationship.
lazy="joined": specifies the data loading strategy. SQLAlchemy will perform a JOIN query and load the profile immediately when the user is requested.
The reverse relationship in the Profile model:
user: Mapped["User"] = relationship(
    "User",
    back_populates="profile",
    uselist=False
)
back_populates="profile": links this relationship to the profile attribute on the User model, enabling bidirectional control. If you link a user and a profile, both attributes (user and profile) are updated at the same time.
uselist=False: since each profile can only have one user, False indicates that this relationship does not hold a list but a single object.
How does it work?
When a User object is created with a Profile attached, SQLAlchemy automatically links them via the relationship definition. For example, if we fetch a user via an ORM select query, SQLAlchemy will perform a JOIN and load the profile automatically thanks to the lazy='joined' option (the same effect can be requested per query with joinedload). We will cover loading strategies in more detail in the next article.
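Here is a minimal sketch of the one-to-one link in action (assuming the models above and an open session; the values are hypothetical):
async with async_session_maker() as session:
    profile = Profile(first_name="Alex", gender=GenderEnum.MALE)
    user = User(username="alex", email="alex@example.com", password="...", profile=profile)
    session.add(user)       # the profile is saved together with the user
    await session.commit()

    # back_populates keeps both sides in sync, even before the commit:
    assert user.profile is profile
    assert profile.user is user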
One-to-Many and Many-to-One Relationships
1. Relationship Example
In our User model, the relationship with the Post model is one-to-many: one user can create many posts, but each post belongs to only one user. From the Post side, this same relationship is called many-to-one, since multiple posts can refer to the same user.
Similarly, the Post model has a one-to-many relationship with Comment: each post can have multiple comments, but each comment refers to a specific post.
2. Why use One-to-Many and Many-to-One?
A one-to-many or many-to-one relationship is used when one object (such as a user) is related to many other objects (such as posts). This allows you to logically group data and avoid redundancy. In our case, each user can have multiple posts, and each post can have multiple comments.
3. Technical implementation
One-to-many and many-to-one relationships are configured using relationship and ForeignKey.
Relationship between User and Post:
class User(Base):
    # User fields...
    posts: Mapped[list["Post"]] = relationship(
        "Post",
        back_populates="user",
        cascade="all, delete-orphan"  # Deletes posts when the user is deleted
    )


class Post(Base):
    # Post fields...
    user_id: Mapped[int] = mapped_column(ForeignKey('users.id'))
    user: Mapped["User"] = relationship(
        "User",
        back_populates="posts"
    )
user_id: in Post, this is the foreign key that links each post to a specific user.
back_populates: back_populates="user" in Post points to the posts field in User, and back_populates="posts" in User points to the user field in Post. These settings ensure a two-way relationship.
cascade="all, delete-orphan": this setting in the User model specifies that all posts associated with a user should be deleted when the user itself is deleted.
Relationship between Post and Comment:
class Post(Base):
    # Post fields...
    comments: Mapped[list["Comment"]] = relationship(
        "Comment",
        back_populates="post",
        cascade="all, delete-orphan"
    )
Here, thanks to relationship, you can associate objects with each other when you create them, and SQLAlchemy will automatically determine which records they belong to.
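As a quick illustration, related objects can be created together, and SQLAlchemy fills in the foreign keys on commit (a sketch assuming the models above and an open session; in real code the post and comment would also need their user set, since those foreign keys are required):
async with async_session_maker() as session:
    post = Post(title="Hello", content="<p>Hi</p>", main_photo_url="photo.png")
    post.comments.append(Comment(content="First!"))  # no post_id needed, relationship handles it
    session.add(post)   # the comment is saved along with the post
    await session.commit()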
Try to associate comments and users yourself before you open the full code. I emphasize that understanding these concepts is very important for a general understanding of interaction with data.
Did you write it? Great, then you can open the spoiler with the full code.
Hidden text
from sqlalchemy import ForeignKey, JSON, text
from sqlalchemy.orm import Mapped, mapped_column, relationship

from database import Base, uniq_str_an, array_or_none_an, content_an
from sql_enums import GenderEnum, ProfessionEnum, StatusPost, RatingEnum


class User(Base):
    username: Mapped[uniq_str_an]
    email: Mapped[uniq_str_an]
    password: Mapped[str]
    profile_id: Mapped[int | None] = mapped_column(ForeignKey('profiles.id'))

    # One-to-one relationship with Profile
    profile: Mapped["Profile"] = relationship(
        "Profile",
        back_populates="user",
        uselist=False,  # Provides the one-to-one relationship
        lazy="joined"   # Automatically loads the related Profile when a User is requested
    )

    # One-to-many relationship with Post
    posts: Mapped[list["Post"]] = relationship(
        "Post",
        back_populates="user",
        cascade="all, delete-orphan"  # When a User is deleted, the associated Posts are also deleted
    )

    # One-to-many relationship with Comment
    comments: Mapped[list["Comment"]] = relationship(
        "Comment",
        back_populates="user",
        cascade="all, delete-orphan"  # When a User is deleted, the associated Comments are also deleted
    )


class Profile(Base):
    first_name: Mapped[str]
    last_name: Mapped[str | None]
    age: Mapped[int | None]
    gender: Mapped[GenderEnum]
    profession: Mapped[ProfessionEnum] = mapped_column(default=ProfessionEnum.DEVELOPER,
                                                       server_default=text("'UNEMPLOYED'"))
    interests: Mapped[array_or_none_an]
    contacts: Mapped[dict | None] = mapped_column(JSON)

    # One-to-one reverse relationship with User
    user: Mapped["User"] = relationship(
        "User",
        back_populates="profile",
        uselist=False
    )


class Post(Base):
    title: Mapped[str]
    content: Mapped[content_an]
    main_photo_url: Mapped[str]
    photos_url: Mapped[array_or_none_an]
    status: Mapped[StatusPost] = mapped_column(
        default=StatusPost.PUBLISHED,
        server_default=text("'DRAFT'"),
        nullable=False
    )
    user_id: Mapped[int] = mapped_column(ForeignKey('users.id'))

    # Many-to-one relationship with User
    user: Mapped["User"] = relationship(
        "User",
        back_populates="posts"
    )

    # One-to-many relationship with Comment
    comments: Mapped[list["Comment"]] = relationship(
        "Comment",
        back_populates="post",
        cascade="all, delete-orphan"
    )


class Comment(Base):
    content: Mapped[content_an]
    user_id: Mapped[int] = mapped_column(ForeignKey('users.id'))
    post_id: Mapped[int] = mapped_column(ForeignKey('posts.id'))
    is_published: Mapped[bool] = mapped_column(default=True, server_default=text("'false'"))
    rating: Mapped[RatingEnum] = mapped_column(default=RatingEnum.FIVE, server_default=text("'SEVEN'"))

    # Many-to-one relationship with User
    user: Mapped["User"] = relationship(
        "User",
        back_populates="comments"
    )

    # Many-to-one relationship with Post
    post: Mapped["Post"] = relationship(
        "Post",
        back_populates="comments"
    )
Real Table Creation
We have described the models of our future tables, but to turn them into real tables in the database, we need to perform the migration process. SQLAlchemy offers several ways to create and update tables based on models, and one of the most popular and powerful tools for this task is Alembic.
Alembic allows you to manage versions of the database schema, track changes and easily migrate them to different environments. This tool simplifies the creation and execution of migrations, keeping the database up to date as the models change.
Let's look at how exactly we can use Alembic to migrate our models to the database to simplify further work with data and maintain convenience and control over changes.
To start working with Alembic, we need to initialize it with support for asynchronous interaction with the database. This can be done with the following command:
alembic init -t async migration
After running this command, a migration directory and an alembic.ini file will appear in the root of the project. The migration directory will contain files for managing migrations, and alembic.ini is the Alembic configuration file that we will need to set up a connection to the database.
Configuring migration/env.py to work with the database
In order for Alembic to be able to correctly interact with our database, we need to make changes to the migration/env.py file. This file is responsible for configuring migrations: in it we specify the database connection information and import all the models.
Configuration steps
1. Importing the connection and models
First of all, we import the database connection URL and the Base base class from the database.py file, as well as all our models from the models.py file:
from database import Base, DATABASE_URL
from models import User, Comment, Post, Profile
2. Connection configuration
Next, we tell Alembic which URL to use to connect to the database:
config = context.config
config.set_main_option("sqlalchemy.url", DATABASE_URL)
Here we use the DATABASE_URL variable, which contains the connection string for our PostgreSQL database.
3. Defining Metadata
We tell Alembic where to look for information about the models. To do this, we assign the target_metadata variable the metadata from Base, which includes all models:
target_metadata = Base.metadata
These steps prepare the env.py file to work with your database. When we set target_metadata, Alembic gets access to the structure of all our models and uses it to create or update the database schema, adding new tables and columns as the models change. The env.py file thus becomes the glue between your SQLAlchemy models and your migrations in Alembic, allowing you to easily make and track changes to your database.
Changed code migration/env.py
import asyncio
from logging.config import fileConfig

from sqlalchemy import pool
from sqlalchemy.engine import Connection
from sqlalchemy.ext.asyncio import async_engine_from_config

from alembic import context

from database import Base, DATABASE_URL
from models import User, Comment, Post, Profile

config = context.config
config.set_main_option("sqlalchemy.url", DATABASE_URL)

if config.config_file_name is not None:
    fileConfig(config.config_file_name)

target_metadata = Base.metadata

# leave the rest of the code unchanged
Preparing the migration file
Now, to prepare the migration file, let's create an instruction for Alembic to use to create the tables. In the terminal, run the following command:
alembic revision --autogenerate -m "Initial revision"
This command will create a migration file that Alembic will use to create our tables based on the models.
If everything was entered correctly, Alembic will report that it found the tables and generated a file with migration instructions at the specified path.
If we look into this file, we will see that each column is described in the more familiar SQLAlchemy Core style. This means we managed to clearly explain to Alembic what fields we want to get in the output.
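For reference, here is roughly what a fragment of the autogenerated upgrade() looks like (illustrative; your revision and column list will differ):
def upgrade() -> None:
    op.create_table(
        'profiles',
        sa.Column('first_name', sa.String(), nullable=False),
        sa.Column('gender', sa.Enum('MALE', 'FEMALE', name='genderenum'), nullable=False),
        sa.Column('id', sa.Integer(), autoincrement=True, nullable=False),
        sa.PrimaryKeyConstraint('id'),
    )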
Important nuances of working with Alembic and ENUM data types
When using Alembic to manage migrations, there are several important features related to ENUM data types in PostgreSQL that need to be taken into account. Let's look at two main issues and how to solve them.
1. Issues when creating a column with ENUM
Once the ENUM data type is created in PostgreSQL, its name is reserved and considered unique. When working with Alembic, this feature is not always taken into account automatically, which can lead to errors in the future.
Example of a problem
Let's say we have a column like this:
sa.Column('gender', sa.Enum('MALE', 'FEMALE', name='genderenum'), nullable=False)
When you first run the migration, Alembic will correctly create the table and ENUM type for this column. However, if you want to change this ENUM in the future (for example, add a new value), you may encounter an error on repeated migrations, since Alembic will try to create a genderenum type that already exists in the database.
Solution
To avoid this problem, you need to explicitly set the create_type=False parameter so that Alembic does not try to re-create the ENUM type:
sa.Column('gender', sa.Enum('MALE', 'FEMALE', name='genderenum', create_type=False), nullable=False)
Tip: Always specify create_type=False for ENUM columns to avoid conflicts when re-running migrations.
2. Issues when downgrading migrations
When you drop tables with Alembic, the tables are dropped, but the ENUMs associated with them remain in the database. This can lead to conflicts if you want to use the same names for new ENUM types in the future.
Solution
To make Alembic correctly drop ENUM types when you roll back migrations, you need to extend the downgrade method as follows:
def downgrade() -> None:
    # Deleting tables
    op.drop_table('comments')
    op.drop_table('posts')
    op.drop_table('users')
    op.drop_table('profiles')

    # Removing ENUM types
    op.execute('DROP TYPE IF EXISTS ratingenum')
    op.execute('DROP TYPE IF EXISTS genderenum')
    op.execute('DROP TYPE IF EXISTS professionenum')
    op.execute('DROP TYPE IF EXISTS statuspost')
In this case, when rolling back the migration, not only the tables will be deleted, but also all the corresponding ENUM types.
Important: Use this method only when you need to delete the ENUM types themselves, to avoid unnecessary deletion if the types may be needed in other parts of the application.
Performing Migrations
Now that you have made all the necessary edits and taken into account the peculiarities of working with ENUM, you can start migrating the database.
Updating the database to the latest migration version
To update the database to the latest version, run the command:
alembic upgrade head
This command will apply all migrations in the migration/versions directory and bring the database structure up to date. upgrade head updates the database schema to the latest version registered in the migration files.
Performing a migration to a specific ID
If you need to update the database not to the latest version, but to a specific migration, you can specify the ID of the desired migration. For example:
alembic upgrade d97a9824423b
In this case, d97a9824423b is the unique ID of the migration you need. Alembic will perform all migrations up to the specified ID, updating the database to the state corresponding to this version.
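If you don't remember the ID you need, Alembic can list all revisions:
alembic history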
Performing the migration
I ran the upgrade command and everything went well. Checking the database confirms that all tables are in place.
Rolling Back Migrations: Downgrade
We've already covered the topic of rolling back migrations, and now I want to make sure you've made all the necessary edits so that not only tables are dropped when rolling back, but also their associated ENUM data types.
Sometimes you may need to roll back changes to your database, for example if a new version has introduced bugs or requires additional work. Alembic supports rolling back migrations using the downgrade command, which allows you to return your database to a previous or specific state.
Rollback one version
To rollback a migration one version, use the following command:
alembic downgrade -1
Rollback to a specific migration
If you need to rollback the database to a specific migration, specify the ID of this migration:
alembic downgrade d97a9824423b
This command will revert the database to the state corresponding to the migration with ID d97a9824423b and remove all changes applied after it.
Downgrade() example considering ENUM types removal
Here is what the downgrade method should look like after making all the necessary edits:
def downgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    op.drop_table('comments')
    op.drop_table('posts')
    op.drop_table('users')
    op.drop_table('profiles')
    # ### end Alembic commands ###

    # Removing ENUM types
    op.execute('DROP TYPE IF EXISTS ratingenum')
    op.execute('DROP TYPE IF EXISTS genderenum')
    op.execute('DROP TYPE IF EXISTS professionenum')
    op.execute('DROP TYPE IF EXISTS statuspost')
Breakdown:
Dropping tables: the op.drop_table() commands remove all the specified tables from the database. This part is generated automatically by Alembic based on the current state of migrations.
Dropping ENUM types: we add op.execute('DROP TYPE IF EXISTS ...') commands to remove ENUM types that would otherwise remain in the database even after the associated tables are dropped.
Testing the rollback
Once you have made your edits and are sure you are ready, you can rollback:
- Roll back one version:
alembic downgrade -1
- Rollback to a specific migration:
alembic downgrade d97a9824423b
This process will delete both tables and associated ENUM types, gracefully reverting the database to its previous states.
We didn't get any errors, so it worked as expected.
Now, with these changes in place, you can confidently manage migration rollbacks, knowing that Alembic will handle both tables and ENUM types in your database gracefully.
Conclusion
Guys, today we have dived deep into the basics of working with SQLAlchemy. We have discussed setting up a database, creating tables, establishing relationships between them, and also looked at the process of migrations using Alembic. I understand that the volume of material turned out to be quite impressive, and it may be difficult to grasp everything at first. But do not worry - practice will help you consolidate your knowledge!
I tried to explain everything as clearly as possible, but the best way to master this topic is to continue working with the code. Take your time, carefully study everything we have talked about, and do not hesitate to consult additional information. This is especially true for working with relationships between tables - without understanding them, it will be difficult to move on.
In the following articles, we will move on to more practical issues: adding, retrieving and filtering data in tables. We will dive deeper into working with relationships between tables so that you can see how this is applied in practice and why setting up relationships is so important.
My plan is to first go through all the intricacies of working with asynchronous SQLAlchemy. We will complete this mini-course by creating a full-fledged project on FastAPI, where we will apply all the knowledge we have gained in practice.
You can easily deploy the finished project, as well as the PostgreSQL database that we set up in this article, on Amverum Cloud in just a few minutes - but we will talk about this separately.
Your response will determine how soon the next article on asynchronous work with SQLAlchemy will be published. Preparing such materials requires time and effort, so any support from you will be very useful!
You can find the full source code used in this article, as well as exclusive content that I do not publish on Habr, in my Telegram channel “Easy Path to Python”. The channel already has almost 1000 participants who study and discuss Python together!
See you soon!