Chandrashekhar Kachawa

Posted on • Originally published at ctrix.pro

Mastering Python Monorepos: A Practical Guide

As projects grow, managing dependencies and shared code across multiple repositories can become a significant challenge. Code gets duplicated, dependency versions drift apart, and coordinating changes becomes a complex dance. A monorepo—a single repository containing multiple distinct projects—offers a powerful solution to these problems.

This guide will walk you through building a scalable Python monorepo, integrating different services like FastAPI and Apache Airflow, and maintaining high code quality with modern tools like Ruff.

Why a Monorepo?

Before we dive in, let's clarify the benefits:

  • Simplified Dependency Management: A single set of dependencies at the root level (or well-defined per-project dependencies) prevents version conflicts.
  • Atomic Commits: Changes across multiple services can be made in a single commit, ensuring consistency and simplifying rollbacks.
  • Seamless Code Sharing: Reusing code between projects is as simple as a local import, eliminating the need for a separate package registry.
  • Unified Tooling: A single configuration for linting, formatting, and testing can be enforced across the entire codebase.

Designing the Monorepo Structure

A clean structure is key to a maintainable monorepo. Here’s a proven layout:

my-monorepo/
├── .gitignore
├── pyproject.toml      # Root config for dependencies and tools (Ruff)
├── packages/
│   └── shared_library/
│       ├── pyproject.toml  # Makes this an installable package
│       └── src/
│           └── shared_library/
│               ├── __init__.py
│               └── core.py       # Shared business logic
└── services/
    ├── fastapi_app/
    │   ├── pyproject.toml
    │   └── src/
    │       └── fastapi_app/
    │           └── main.py
    └── airflow_dags/
        ├── pyproject.toml
        └── src/
            └── airflow_dags/
                └── example_dag.py
  • packages/: Contains local Python packages intended to be shared across different services.
  • services/: Contains the actual applications, like your FastAPI microservice or Airflow DAGs.
  • pyproject.toml: The root pyproject.toml is crucial for centralizing tool configuration and, optionally, base dependencies.
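To make this concrete, here is a minimal sketch of what the root pyproject.toml might contain (illustrative only — the Ruff settings are covered in detail later in this guide):

```toml
# Root pyproject.toml — centralizes tool configuration for the whole repo.
# It does not need to define an installable package itself; each entry under
# packages/ and services/ keeps its own pyproject.toml for packaging metadata.

[tool.ruff]
line-length = 88
```

Keeping tool configuration here, rather than duplicated per service, is what makes "run one command from the root" workflows possible.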

Creating a Shared Package

The magic of a monorepo lies in its ability to share code effortlessly. Let's make shared_library an installable package.

packages/shared_library/pyproject.toml

[project]
name = "shared_library"
version = "0.1.0"

[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

# Tell setuptools to find packages under src/ (the "src layout")
[tool.setuptools.packages.find]
where = ["src"]

packages/shared_library/src/shared_library/core.py

# Example shared function
def get_greeting() -> str:
    return "Hello from the shared library!"

Now, any service in your monorepo can install this package in editable mode. This means changes to the library code are immediately reflected in the services that use it, without needing a re-installation.

From the root of the monorepo, you would typically install this into the virtual environment of a service that needs it:

# Example: Installing for the fastapi_app
pip install -e packages/shared_library
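Under the hood, an editable install essentially makes the package's src/ directory resolvable on sys.path, so imports hit the live source files. The effect can be sketched with the standard library alone (the temp-directory layout below is illustrative, mirroring the shared_library structure above):

```python
# Sketch of what an editable install effectively does: put the package's
# src/ directory on sys.path so imports resolve to the live files.
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of the shared_library src layout.
root = Path(tempfile.mkdtemp())
pkg = root / "src" / "shared_library"
pkg.mkdir(parents=True)
(pkg / "__init__.py").write_text("")
(pkg / "core.py").write_text(
    'def get_greeting() -> str:\n'
    '    return "Hello from the shared library!"\n'
)

# `pip install -e` wires this up for you via a .pth file; we do it by hand:
sys.path.insert(0, str(root / "src"))

from shared_library.core import get_greeting  # noqa: E402

print(get_greeting())  # Hello from the shared library!
```

Because the interpreter reads core.py from disk on import, editing the file and restarting the service picks up the change with no reinstall step.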

Integrating Your Services

Let's see how our FastAPI app and Airflow DAGs can use the shared_library.

1. FastAPI Microservice

Your FastAPI app can now import directly from the shared code.

services/fastapi_app/src/fastapi_app/main.py

from fastapi import FastAPI
from shared_library.core import get_greeting

app = FastAPI()

@app.get("/")
def read_root():
    greeting = get_greeting()
    return {"message": greeting}

When you run this FastAPI application, it will execute the code from shared_library as if it were a standard installed package.
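Since the endpoint is a plain function, you can sanity-check the wiring without starting a server. A minimal sketch (get_greeting is inlined here so the snippet runs standalone; in the monorepo it comes from shared_library.core):

```python
# The endpoint body is just a function call, so it can be exercised directly.
# get_greeting is inlined for a self-contained snippet; in the monorepo it
# would be imported from shared_library.core.
def get_greeting() -> str:
    return "Hello from the shared library!"

def read_root() -> dict:
    return {"message": get_greeting()}

print(read_root())  # {'message': 'Hello from the shared library!'}
```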

2. Apache Airflow DAGs

Similarly, your Airflow DAGs can pull in shared logic for tasks, connections, or configurations.

services/airflow_dags/src/airflow_dags/example_dag.py

from __future__ import annotations

import pendulum

from airflow.models.dag import DAG
from airflow.operators.python import PythonOperator
from shared_library.core import get_greeting

def print_greeting():
    """A simple Python function to be executed by Airflow."""
    message = get_greeting()
    print(message)

with DAG(
    dag_id="example_shared_library_dag",
    start_date=pendulum.datetime(2023, 1, 1, tz="UTC"),
    catchup=False,
    schedule=None,
    tags=["example"],
) as dag:
    PythonOperator(
        task_id="print_shared_greeting",
        python_callable=print_greeting,
    )

For Airflow to find your shared_library, you must ensure it's installed in the Python environment where your Airflow scheduler and workers are running.
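A quick way to verify this before a DAG fails at parse time is to check importability from inside the scheduler/worker environment. A small stdlib-only helper (the function name is illustrative):

```python
# Sanity check: confirm a package is importable in the current environment
# before a DAG tries to use it. Run this inside the Airflow scheduler/worker
# environment (e.g. via `python -c ...` in the container).
import importlib.util

def is_importable(name: str) -> bool:
    """Return True if `name` can be imported in the current environment."""
    return importlib.util.find_spec(name) is not None

print(is_importable("json"))  # True — stdlib modules are always importable
print(is_importable("shared_library"))  # True only if it has been installed
```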

Tooling and Best Practices with Ruff

A monorepo is the perfect environment for standardized tooling. Ruff is an incredibly fast linter and formatter that can be configured at the root of your project.

In your root pyproject.toml, add the following configuration:

[tool.ruff]
line-length = 88
# Limit Ruff to the monorepo's source trees
include = [
    "packages/shared_library/src/**/*.py",
    "services/fastapi_app/src/**/*.py",
    "services/airflow_dags/src/**/*.py",
]

[tool.ruff.lint]
# Enable pycodestyle (E), Pyflakes (F), and isort (I) rules
select = ["E", "F", "I"]

[tool.ruff.format]
# Use Black-compatible formatting
quote-style = "double"

Now, you can run Ruff from the root of your monorepo to check and format your entire codebase with a single command:

# Check for linting errors
ruff check .

# Format all files
ruff format .

This ensures that all your Python code, regardless of which service it belongs to, adheres to the same quality standards.

Conclusion

Adopting a Python monorepo requires a shift in mindset but offers immense rewards in terms of maintainability, consistency, and development speed. By creating a logical structure with shared packages and enforcing unified tooling with Ruff, you can build a robust foundation that scales with your projects. This architecture empowers you to manage complex systems like co-dependent microservices and data pipelines with greater confidence and efficiency.
