In the dynamic world of web development, Python has emerged as a dominant force, especially in backend development, which is the primary focus of this blog post. It's worth mentioning that there are ongoing efforts to use Python for the frontend as well, like Reflex (previously known as Pynecone; they presumably had to change their name because of the Pinecone vector database), which even garnered support from Y Combinator. Samuel Colvin (creator of Pydantic) is also working on FastUI (he released the first version in December 2023).
But let's pivot back to where Python really shines – the backend. This post will take you through some of the coolest stuff in Python backend development, starting with how FastAPI and Pydantic are changing the game in API development, with their fast, efficient and Pythonic approach.
To make this blog post more hands-on, I have put together a GitHub repository where you can find all these best practices implemented: https://github.com/matinone/quiz-app.
If you are only interested in general Python best practices, not specific to backend development, feel free to skip ahead to the Pytest or Poetry sections.
FastAPI and Pydantic
FastAPI is "a modern, fast (high-performance), web framework for building APIs with Python 3.8+ based on standard Python type hints". It is not only fast to run (one of the fastest Python frameworks available, on par with Go and NodeJS), but also fast to code. As a developer, you will appreciate its intuitive design and ease of use, leading to reduced development time and costs.
The framework's efficiency comes from its use of Starlette for building asynchronous web services and Pydantic for robust data validation and serialization, powered by Python's type hints. Pydantic V2 was officially released in June 2023; it is a ground-up rewrite that offers many new features and performance improvements, so make sure you are using it instead of V1.
The best way to appreciate FastAPI is with a small code sample showing how simple and powerful it is.
from fastapi import FastAPI

app = FastAPI()

@app.get("/items/{item_id}")
async def read_item(item_id: int):
    return {"item_id": item_id}
The above code defines an endpoint /items/{item_id} with a path parameter item_id. By just using type hints (item_id: int), you get:
- Editor support: error checks and autocompletion.
- Data parsing: it converts the string that comes from an HTTP request into Python data (an int in this case).
- Data validation: it raises an error if the path parameter isn't an integer, so /items/pi and /items/3.14 would both return an error.
- Automatic documentation: it generates automatic, interactive API documentation available at /docs.
And we haven't even explicitly used a Pydantic model yet. FastAPI has excellent documentation, so you can go straight to the full list of available features and the official tutorial for more details.
Async SQLAlchemy
SQLAlchemy is a favorite in the Python community for working with databases, particularly known for its powerful Object-Relational Mapping (ORM) capabilities. An ORM basically maps Python classes to database tables, class attributes to table columns, and instances of the class to rows in the table, allowing the interaction with a database using Python objects instead of writing raw SQL queries.
SQLAlchemy 2.0 was released in January 2023, with lots of changes and improvements, fully supporting asynchronous operations. Asynchronous programming allows a single-threaded program to handle multiple operations concurrently, making it especially suitable for I/O-bound tasks like database interactions. In traditional synchronous programming, database operations can be a bottleneck, as the program has to wait for each query to complete before moving on to the next task. With async programming, the application can continue to run other tasks while waiting for database operations to complete, leading to more efficient resource utilization and better overall performance.
FastAPI’s asynchronous nature allows it to handle multiple requests simultaneously without getting blocked by database operations. When using Async SQLAlchemy, database queries are executed in a non-blocking manner, meaning your application can serve other requests while waiting for the database to respond.
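The effect is easy to see with nothing but the standard library. In the sketch below, asyncio.sleep stands in for a database round trip (an assumption made to keep the example self-contained; no real database is involved), and three simulated queries run concurrently on a single thread.

```python
import asyncio
import time


async def fake_query(name: str, seconds: float) -> str:
    # asyncio.sleep stands in for waiting on the database; it yields control
    # to the event loop instead of blocking the thread.
    await asyncio.sleep(seconds)
    return name


async def run_concurrently() -> tuple[list[str], float]:
    start = time.perf_counter()
    # gather schedules the three coroutines together and keeps their order
    results = await asyncio.gather(
        fake_query("quizzes", 0.2),
        fake_query("questions", 0.2),
        fake_query("users", 0.2),
    )
    return list(results), time.perf_counter() - start


results, elapsed = asyncio.run(run_concurrently())
print(results, f"{elapsed:.2f}s")
```

Run one after another, the three waits would add up to about 0.6 seconds; run concurrently, the total stays close to the longest single wait.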
With the background on Async SQLAlchemy, Pydantic and FastAPI covered, let’s now see them all together in action. What we have here is a simplified version to showcase how these tools can be integrated. For a more detailed implementation, remember to check out the GitHub repository mentioned in the introduction.
First, let's start with some Pydantic models (or schemas) to represent a quiz.
from datetime import datetime

from pydantic import BaseModel, ConfigDict

class QuizCreate(BaseModel):
    title: str
    description: str | None = None

    model_config = ConfigDict(from_attributes=True)

class QuizReturn(BaseModel):
    id: int
    title: str
    description: str | None = None
    created_at: datetime
    updated_at: datetime

    model_config = ConfigDict(from_attributes=True)
The model_config = ConfigDict(from_attributes=True) line (ORM mode / from_orm in Pydantic V1) allows the models to be created from arbitrary class instances by matching up the instance attributes with the model fields. This is incredibly handy for converting SQLAlchemy models into Pydantic models when returning data from a FastAPI endpoint.
Next up, let's define our SQLAlchemy model.
from datetime import datetime
from typing import Self  # available in Python 3.11+

from sqlalchemy import DateTime, String, select
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
from sqlalchemy.sql import func

from schemas.quiz import QuizCreate

class Base(DeclarativeBase):
    pass

class Quiz(Base):
    __tablename__ = "quizzes"

    id: Mapped[int] = mapped_column(primary_key=True)
    title: Mapped[str] = mapped_column(String(128), nullable=False, index=True)
    # nullable column, so the Python type is optional as well
    description: Mapped[str | None] = mapped_column(String(512), nullable=True)
    created_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), server_default=func.now()
    )
    updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True))

    @classmethod
    async def create(cls, db: AsyncSession, quiz: QuizCreate) -> Self:
        new_quiz = cls(title=quiz.title, description=quiz.description)
        # SQL expression: the database computes the timestamp on INSERT
        new_quiz.updated_at = func.now()
        db.add(new_quiz)
        await db.commit()
        # reload the row so server-generated values are available
        await db.refresh(new_quiz)
        return new_quiz

    @classmethod
    async def get(cls, db: AsyncSession, id: int) -> Self | None:
        result = await db.execute(select(cls).where(cls.id == id))
        return result.scalar()
A few key things to note in this SQLAlchemy model:
- Some CRUD operations are implemented as class methods of the model itself. While there are other ways to handle CRUD, like using separate classes (repository pattern) or mixing it into the endpoint logic, having it here, close to the definition of the available fields, keeps things concise and organized, and still makes our code reusable.
- The create and get methods are asynchronous, ensuring our database operations don't block other processes and play nicely with FastAPI's async nature.
- The model attributes use the latest Mapped and mapped_column for type annotations, fully embracing SQLAlchemy 2.0.
- The create method takes a QuizCreate Pydantic model as a parameter, showing how Pydantic models can control how data is passed into our database operations.
Finally, let's implement an endpoint to create a new quiz.
from typing import Any

from fastapi import FastAPI, status

from models.Quiz import Quiz
from schemas.quiz import QuizCreate, QuizReturn
from models.database import AsyncSessionDep

app = FastAPI(title="Quiz App")

@app.post(
    "/quizzes",
    response_model=QuizReturn,
    status_code=status.HTTP_201_CREATED,
    summary="Create a new quiz",
    response_description="The new created quiz",
)
async def create_quiz(db: AsyncSessionDep, quiz: QuizCreate) -> Any:
    new_quiz = await Quiz.create(db=db, quiz=quiz)
    return new_quiz
Once again, there are a few key things to note:
- Input parsing and validation: the quiz: QuizCreate in the function signature tells FastAPI to expect a request body that matches the structure of the QuizCreate Pydantic model. FastAPI parses the request body and validates it, checking data types and mandatory fields. If the incoming data does not conform to the QuizCreate model, it automatically returns an error response with details about the validation errors.
- Output conversion, filtering and serialization: the response_model=QuizReturn parameter in the route decorator tells FastAPI to use the QuizReturn Pydantic model to serialize and filter the output data. Before sending the response to the client, FastAPI serializes new_quiz using the QuizReturn model (this is why from_attributes=True was important), which means that it will convert the SQLAlchemy model instance (new_quiz) into a JSON response, ensuring that the response matches the structure defined in QuizReturn. This serialization process also acts as a filter: only the fields defined in QuizReturn will be included in the response.
In summary, FastAPI, SQLAlchemy and Pydantic work together to validate incoming data against a defined schema (QuizCreate), handle any validation errors, and then serialize and filter the outgoing data according to another schema (QuizReturn).
There is one final important thing to keep in mind regarding async support in SQLAlchemy 2.0: many of the great features of SQLAlchemy are possible because of database instructions issued implicitly (for example as a result of the application accessing an attribute of a model instance). When working with asynchronous operations, implicit I/O as a result of an attribute being accessed is not possible because all database activity must happen in the context of an await function call, so special care must be taken to make it work as intended.
Alembic
Alembic is a lightweight database migration tool for usage with SQLAlchemy. In this context, migration means changes to the database schema (add a new column to a table, modify the type of an existing column, create a new index, etc.). Alembic "provides for the creation, management, and invocation of change management scripts for a relational database, using SQLAlchemy as the underlying engine". It is designed to handle changes to the database schema over time, allowing for a version-controlled approach to database evolution, so you can keep track of changes and revert back to previous versions if necessary.
When initializing Alembic in the project, make sure to use the -t async option for asynchronous support.
One of Alembic's key features is its ability to auto-generate migration scripts. By analyzing the current database state and comparing it with the application's table metadata, Alembic can automatically generate the necessary migration scripts using the --autogenerate flag in the alembic revision command. Note that autogenerate does not detect all database changes, so it is always necessary to manually review (and correct if needed) the candidate migrations that it produces.
For more details about how to actually use Alembic, you can check this other post.
Pytest
Testing is an integral part of any robust backend development process, and Pytest stands out as the ideal Python testing framework, allowing you to write tests quickly and with minimal boilerplate code.
Testing an asynchronous application, like one using FastAPI and Async SQLAlchemy, requires some special considerations. We need tools that can handle the asynchronous nature of our application, specifically an async HTTP client, an async database engine and the ability to run asynchronous test cases.
We can use the async HTTP client provided by httpx, a fully featured HTTP client for Python with an API broadly compatible with requests, so it can be used in pretty much the same way in most cases.
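from typing import AsyncGenerator

import pytest
from httpx import ASGITransport, AsyncClient

# app (the FastAPI instance) and get_session (the session dependency) come
# from the application code; their import paths depend on the project layout

@pytest.fixture(scope="function")
async def client(db_session) -> AsyncGenerator[AsyncClient, None]:
    # override get_session dependency to return the DB session
    # from the fixture (instead of creating a new one)
    def override_get_session():
        yield db_session

    app.dependency_overrides[get_session] = override_get_session
    # ASGITransport sends requests straight to the app, with no network involved
    # (recent httpx versions no longer accept AsyncClient(app=app))
    async with AsyncClient(
        transport=ASGITransport(app=app), base_url="http://test"
    ) as client:
        yield client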
Note that we are using a Pytest fixture to pass the client to each test case, and we are also overriding the session dependency that the different endpoints use, so they use the session we create for the tests.
By default, Pytest doesn't support the execution of asynchronous code, but we can use the pytest-asyncio plugin to support it. To have async def test_* functions automatically detected as proper tests, we can set asyncio_mode = "auto" in the pyproject.toml file.
[tool.pytest.ini_options]
asyncio_mode = "auto"
Finally, we need an asynchronous database engine. An important aspect of testing applications that interact with a database is ensuring that tests do not interfere with each other by modifying the database state. To achieve this, we can set up our database sessions in such a way that each test case runs within a transaction that is rolled back at the end of the test. This means that any changes made to the database during a test are undone at the end of it, leaving the database in a clean state for the next test.
import asyncio
from typing import AsyncGenerator

import pytest
from sqlalchemy import event, text
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.orm import Session, SessionTransaction

# async_engine, AsyncSessionLocal, Base and factory_list come from the
# application and test code; their import paths depend on the project layout

@pytest.fixture(scope="session")
def event_loop():
    """
    Custom session-scoped event loop fixture, created to be able to use the
    db_connection fixture with a session scope.
    """
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    yield loop
    loop.close()

@pytest.fixture(scope="session", autouse=True)
async def db_connection() -> None:
    """
    Fixture to create database tables from scratch for each test session.
    """
    # always drop and create test DB tables between test sessions;
    # engine.begin() commits the DDL when the block exits
    async with async_engine.begin() as connection:
        await connection.run_sync(Base.metadata.drop_all)
        await connection.run_sync(Base.metadata.create_all)

@pytest.fixture(scope="function")
async def db_session() -> AsyncGenerator[AsyncSession, None]:
    async with async_engine.connect() as conn:
        # outer transaction, rolled back at the end of the test
        await conn.begin()
        # enforce foreign key constraints (PRAGMA foreign_keys applies to a connection)
        await conn.execute(text("PRAGMA foreign_keys = 1"))
        await conn.begin_nested()

        async_session = AsyncSessionLocal(bind=conn, expire_on_commit=False)

        # ensures a savepoint is always available to roll back to
        @event.listens_for(async_session.sync_session, "after_transaction_end")
        def end_savepoint(session: Session, transaction: SessionTransaction) -> None:
            if conn.closed:
                return
            if not conn.in_nested_transaction():
                if conn.sync_connection:
                    conn.sync_connection.begin_nested()

        # point every factory at the session used by this test
        for factory in factory_list:
            factory._meta.sqlalchemy_session = async_session  # type: ignore

        yield async_session

        await async_session.close()
        await conn.rollback()

    await async_engine.dispose()
Using all these fixtures, a very basic test case could look like this.
from fastapi import status

async def test_create_quiz(client: AsyncClient, db_session: AsyncSession):
    quiz_data = {"title": "My quiz", "description": "My quiz description"}
    response = await client.post("/quizzes", json=quiz_data)
    assert response.status_code == status.HTTP_201_CREATED
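The rollback-per-test idea behind the db_session fixture can also be seen in isolation. The sketch below is a simplified, synchronous analogue that uses the standard library's sqlite3 module as a stand-in for the async engine; run_isolated and fake_test are names invented for the example, and the real fixtures additionally manage savepoints so that commits inside the application code can be rolled back too.

```python
import sqlite3

# One shared in-memory database plays the role of the test database.
# isolation_level=None disables the driver's implicit transaction handling,
# so BEGIN and ROLLBACK below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE quizzes (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")


def count_quizzes() -> int:
    return conn.execute("SELECT COUNT(*) FROM quizzes").fetchone()[0]


def run_isolated(test_body) -> None:
    """Run test_body inside a transaction that is always rolled back."""
    conn.execute("BEGIN")
    try:
        test_body()
    finally:
        conn.execute("ROLLBACK")


def fake_test() -> None:
    conn.execute("INSERT INTO quizzes (title) VALUES ('My quiz')")
    assert count_quizzes() == 1  # the row is visible inside the transaction


run_isolated(fake_test)
print(count_quizzes())  # the insert was undone by the rollback
```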
Factory Boy
When it comes to testing, particularly in applications involving databases, generating test data can be a tedious and repetitive task. This is where Factory Boy comes into play, as a Python library designed to make it easier and more efficient to set up test data, and that also works really well with SQLAlchemy.
Factory Boy allows you to define 'factories' for your data models. These factories are blueprints for creating instances of your models, filled with default data, randomized data or customized data for the current test.
Something to keep in mind is that the database session created in the db_session fixture must be associated with each factory; that's why the fixture contains the line factory._meta.sqlalchemy_session = async_session.
Factory Boy doesn't currently support asynchronous operations, but there is an async-factory-boy extension with enough async support for most use cases. There is also an open pull request in Factory Boy with recent updates (July 2023) that will hopefully be merged soon.
# question_factory.py
import random
from datetime import datetime

import factory
from async_factory_boy.factory.sqlalchemy import AsyncSQLAlchemyFactory

# Question, QuestionType and QuizFactory are defined elsewhere in the project

class QuestionFactory(AsyncSQLAlchemyFactory):
    id = factory.Sequence(lambda x: x)
    content = factory.Faker("sentence")
    type = factory.Iterator([e.value for e in QuestionType])
    points = factory.LazyAttribute(lambda x: random.randint(0, 10))
    created_at = factory.LazyFunction(datetime.now)
    updated_at = factory.LazyFunction(datetime.now)
    # creating a question also creates the quiz it belongs to
    quiz = factory.SubFactory(QuizFactory)

    class Meta:
        model = Question
        sqlalchemy_session_persistence = "commit"

# test_questions.py
async def test_get_question(client: AsyncClient, db_session: AsyncSession):
    created_question = await QuestionFactory.create()

    response = await client.get(f"/questions/{created_question.id}")
    assert response.status_code == status.HTTP_200_OK

    question = response.json()
    for key in ["id", "quiz_id", "content", "type", "points"]:
        assert question[key] == getattr(created_question, key)
    # datetimes come back as ISO 8601 strings in the JSON response
    for key in ["created_at", "updated_at"]:
        assert question[key] == getattr(created_question, key).isoformat()
Instead of manually creating a question (which would require creating a quiz first) with some random values, we just use a QuestionFactory. We could have done it manually in a reusable fixture, but that would still mean more boilerplate code, it would be harder to generate custom values for specific test cases, and the data wouldn't be as randomized for each test.
Dirty Equals
Dirty Equals is a library created by Samuel Colvin (creator of Pydantic) that "(mis)uses the __eq__ method to make Python code (generally unit tests) more declarative and therefore easier to read and write". It allows you to do stuff like this:
from dirty_equals import IsJson, IsNow, IsPositiveInt, IsStr

def test_user_endpoint(client: 'HttpClient', db_conn: 'Database'):
    client.post('/users/create/', data=...)

    user_data = db_conn.fetchrow('select * from users')
    assert user_data == {
        'id': IsPositiveInt,
        'username': 'samuelcolvin',
        'avatar_file': IsStr(regex=r'/[a-z0-9\-]{10}/example\.png'),
        'settings_json': IsJson({'theme': 'dark', 'language': 'en'}),
        'created_ts': IsNow(delta=3),
    }
The IsNow and IsDatetime features (with iso_string=True) can save you quite some time when writing test cases that, for example, need to check the creation or update time of something.
Poetry
Poetry is a tool for dependency management and packaging in Python, so be ready to say goodbye to pip and requirements.txt files. Poetry allows you to declare the libraries your project depends on and it will manage (install/update) them for you. It also offers a lockfile to ensure repeatable installs.
It not only installs dependencies but also manages them. Poetry resolves dependency conflicts and creates a pyproject.toml file to track your project's dependencies along with their specific versions. You can also group the dependencies to distinguish between production/development/testing dependencies.
Poetry also automatically creates and manages virtual environments for your projects, ensuring that dependencies for different projects are isolated and don’t interfere with each other.
"Why Is Poetry Essential to the Modern Python Stack?" is a great article to understand the importance of using a tool like Poetry.
Additionally, Poetry's rich plugin ecosystem further enhances its utility. For instance, Poe the Poet is a task runner plugin that integrates seamlessly with Poetry, letting you define and run project tasks directly from your pyproject.toml file. Imagine running your tests simply with poetry poe test instead of the potentially longer pytest command line (like pytest app/tests/ -v --cov --cov-report=term-missing).
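For reference, a task like that could be declared as follows. This is a minimal sketch that assumes Poe the Poet is installed as a Poetry plugin; it reuses the pytest command line quoted above.

```toml
[tool.poe.tasks]
# "test" is the task name used in `poetry poe test`
test = "pytest app/tests/ -v --cov --cov-report=term-missing"
```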
Ruff
Ruff is an emerging tool in the Python ecosystem that describes itself as "an extremely fast Python linter and code formatter, written in Rust".
Using a linter and a code formatter is essential in development for several reasons. A linter improves code quality by detecting errors and enforcing coding standards, while a code formatter ensures consistent styling across the codebase, enhancing readability and saving time on manual formatting.
Ruff really delivers on its performance claims, since it is orders of magnitude faster than existing linters and code formatters. In my own experience, linting a large Python codebase at work, which usually takes about 20 seconds with flake8, is done in less than a second with Ruff.
Its adoption is growing rapidly, with major Python projects like FastAPI, Pandas, SciPy, and Airflow already incorporating Ruff into their development workflows.
Ruff is not only much faster, but it is also very convenient to have an all-in-one solution that replaces multiple other widely used tools: Flake8 (linter), isort (imports sorting), Black (code formatter), autoflake, many Flake8 plugins and more. And it has drop-in parity with these tools, so it is really straightforward to migrate from them to Ruff.
It also has a great Visual Studio Code extension that you should definitely install.
Timothy Crosley, the creator of isort, puts it succinctly:
Just switched my first project to Ruff. Only one downside so far: it's so fast I couldn't believe it was working till I intentionally introduced some errors.
Pre-commit Hooks
Pre-commit hooks act as the first line of defense in maintaining code quality, seamlessly integrating with linters and code formatters. They automatically execute these tools each time a developer tries to commit code to the repository, ensuring the code adheres to the project's standards. If the hooks detect issues, the commit is paused until the issues are resolved, guaranteeing that only code meeting quality standards makes it into the repository.
With Ruff as part of your pre-commit hooks, the checks are so swift that they hardly interrupt your coding flow, so you get the benefits of rigorous code quality checks without any noticeable delay in your normal workflow. Below is an example of a .pre-commit-config.yaml file configured to run Ruff's linter and formatter:
repos:
  # run the Ruff linter
  - repo: https://github.com/astral-sh/ruff-pre-commit
    # Ruff version
    rev: v0.1.3
    hooks:
      - id: ruff
        args: [--fix, --exit-non-zero-on-fix]
  # run the Ruff formatter
  - repo: https://github.com/astral-sh/ruff-pre-commit
    # Ruff version
    rev: v0.1.3
    hooks:
      - id: ruff-format
Mypy
Mypy is "an optional static type checker for Python that aims to combine the benefits of dynamic (or "duck") typing and static typing". As Python is dynamically typed, Mypy adds an extra layer of safety by checking types statically, without running the code (based on type annotations conforming to PEP 484), catching potential errors before runtime.
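Here is a small sketch of the kind of mistake this catches. The function total_points is invented for the example; the commented-out call shows an argument that Mypy would flag from the annotations alone.

```python
def total_points(points: list[int]) -> int:
    """Sum the points of all questions in a quiz."""
    return sum(points)


print(total_points([5, 3, 2]))

# Uncommenting the call below would fail at runtime inside sum() with a
# TypeError. Mypy reports the mismatched argument type (list[str] where
# list[int] is expected) without running anything.
# total_points(["5", "3", "2"])
```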
Conclusion
In this post, we've covered some of the most powerful tools in modern Python backend development, from the rapid API construction with FastAPI and Pydantic to the efficient database handling with Async SQLAlchemy and Alembic. We explored how Pytest, augmented by Factory Boy and Dirty Equals, can refine testing, and how Poetry streamlines dependency management. Tools like Ruff and Mypy enhance code quality, while pre-commit hooks enforce these standards automatically. Together, they contribute to a more efficient, robust and maintainable development process.
Embracing these tools not only elevates the quality of your Python projects but also enhances your workflow, showing that the right set of tools can indeed make a world of difference in software development.