Sushil Baligar

Modern Bazel with Python - Module 5: Advanced Python Rules and Toolchains

Learning Objectives

By the end of this module, you will:

  • Master custom Python rules and macros
  • Configure Python toolchains for different environments
  • Implement hermetic Python builds
  • Handle complex Python packaging scenarios
  • Use aspects for code analysis and transformation

5.1 Custom Python Rules and Macros

Creating Custom Rules

Bazel's extensibility shines when you need custom build logic. Let's create a custom rule for Python code generation:

# //tools/python_rules.bzl
load("@rules_python//python:defs.bzl", "PyInfo")

def _python_proto_library_impl(ctx):
    """Implementation of python_proto_library rule."""
    proto_files = ctx.files.srcs
    output_dir = ctx.actions.declare_directory("proto_gen")

    # Generate Python files from the .proto sources.
    # Note: --proto_path assumes all sources live in the same directory.
    ctx.actions.run(
        inputs = proto_files,
        outputs = [output_dir],
        executable = ctx.executable._protoc,
        arguments = [
            "--python_out=" + output_dir.path,
            "--proto_path=" + proto_files[0].dirname,
        ] + [f.path for f in proto_files],
        mnemonic = "ProtocPython",
    )

    return [
        DefaultInfo(files = depset([output_dir])),
        PyInfo(
            transitive_sources = depset([output_dir]),
            # Import paths are resolved relative to the runfiles root.
            imports = depset([ctx.workspace_name + "/" + output_dir.short_path]),
        ),
    ]

python_proto_library = rule(
    implementation = _python_proto_library_impl,
    attrs = {
        "srcs": attr.label_list(
            allow_files = [".proto"],
            mandatory = True,
        ),
        "_protoc": attr.label(
            default = "@com_google_protobuf//:protoc",
            executable = True,
            cfg = "exec",
        ),
    },
)

Advanced Macros

Create sophisticated macros that generate multiple targets:

# //tools/python_macros.bzl
def python_microservice(name, srcs, deps = [], **kwargs):
    """Macro for creating a complete Python microservice."""

    # Main library
    native.py_library(
        name = name + "_lib",
        srcs = srcs,
        deps = deps,
        **kwargs
    )

    # Binary (expects a <name>_main.py entry point in the package)
    native.py_binary(
        name = name,
        srcs = [name + "_main.py"],
        deps = [":" + name + "_lib"],
        main = name + "_main.py",
    )

    # Tests
    test_srcs = native.glob([name + "_*test.py"])
    if test_srcs:
        native.py_test(
            name = name + "_test",
            srcs = test_srcs,
            deps = [
                ":" + name + "_lib",
                "@pypi_deps//pytest",
            ],
        )

    # Docker image. Note: this genrule shells out to the local Docker daemon,
    # so it is not hermetic; prefer rules_oci (or rules_docker) in real projects.
    native.genrule(
        name = name + "_docker",
        srcs = [":" + name],
        outs = [name + ".tar"],
        cmd = """
        docker build -t {name}:latest - <<EOF
FROM python:3.11-slim
COPY $(location :{name}) /app/
WORKDIR /app
ENTRYPOINT ["python", "{name}"]
EOF
        docker save {name}:latest > $@
        """.format(name = name),
    )

5.2 Python Toolchains

Configuring Multiple Python Versions

Register runtimes for several system-installed Python versions. These point at absolute interpreter paths, so they are only as hermetic as the machines they run on:

# //tools/python_toolchain/BUILD
load("@rules_python//python:defs.bzl", "py_runtime", "py_runtime_pair")

py_runtime(
    name = "python38_runtime",
    interpreter_path = "/usr/bin/python3.8",
    python_version = "PY3",
)

py_runtime(
    name = "python39_runtime", 
    interpreter_path = "/usr/bin/python3.9",
    python_version = "PY3",
)

py_runtime(
    name = "python311_runtime",
    interpreter_path = "/usr/bin/python3.11", 
    python_version = "PY3",
)

py_runtime_pair(
    name = "python38_runtime_pair",
    py3_runtime = ":python38_runtime",
)

py_runtime_pair(
    name = "python39_runtime_pair",
    py3_runtime = ":python39_runtime", 
)

py_runtime_pair(
    name = "python311_runtime_pair",
    py3_runtime = ":python311_runtime",
)

toolchain(
    name = "python38_toolchain",
    toolchain = ":python38_runtime_pair",
    toolchain_type = "@rules_python//python:toolchain_type",
)

toolchain(
    name = "python39_toolchain", 
    toolchain = ":python39_runtime_pair",
    toolchain_type = "@rules_python//python:toolchain_type",
)

toolchain(
    name = "python311_toolchain",
    toolchain = ":python311_runtime_pair", 
    toolchain_type = "@rules_python//python:toolchain_type",
)
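Declaring the toolchains is only half the job: they also have to be registered before toolchain resolution will consider them. A minimal sketch, assuming the BUILD file above lives at //tools/python_toolchain:

# WORKSPACE (sketch): make the toolchains visible to toolchain resolution
register_toolchains(
    "//tools/python_toolchain:python38_toolchain",
    "//tools/python_toolchain:python39_toolchain",
    "//tools/python_toolchain:python311_toolchain",
)

For a one-off run against a specific version, you can instead pass --extra_toolchains=//tools/python_toolchain:python311_toolchain on the command line.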

Hermetic Python Builds

Create fully hermetic builds by letting rules_python download a standalone interpreter instead of relying on the system one:

# WORKSPACE
load("@rules_python//python:repositories.bzl", "python_register_toolchains")

# Downloads a standalone CPython 3.11.4 and registers it as a toolchain, so the
# build no longer depends on whatever Python is installed on the host.
python_register_toolchains(
    name = "python_3_11",
    python_version = "3.11.4",
    # Allow the hermetic interpreter to run as root (useful in Docker-based CI).
    ignore_root_user_error = True,
)

load("@python_3_11//:defs.bzl", "interpreter")
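If the workspace uses Bzlmod instead of a WORKSPACE file, the equivalent setup looks roughly like this (a sketch; the rules_python version below is illustrative, so pin whatever your project already uses):

# MODULE.bazel (sketch)
bazel_dep(name = "rules_python", version = "0.31.0")

python = use_extension("@rules_python//python/extensions:python.bzl", "python")
python.toolchain(
    python_version = "3.11",
    is_default = True,
)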

5.3 Advanced Dependency Management

Custom Repository Rules

Create custom repository rules for complex dependency scenarios:

# //tools/custom_repos.bzl
def _python_wheel_impl(repository_ctx):
    """Download and extract a Python wheel."""
    url = repository_ctx.attr.url
    sha256 = repository_ctx.attr.sha256

    repository_ctx.download_and_extract(
        url = url,
        sha256 = sha256,
        type = "zip",  # wheels are zip archives; the .whl extension is not auto-detected
        stripPrefix = repository_ctx.attr.strip_prefix,
    )

    # Generate BUILD file
    repository_ctx.file("BUILD", """
load("@rules_python//python:defs.bzl", "py_library")

py_library(
    name = "pkg",
    srcs = glob(["**/*.py"]),
    data = glob(["**/*"], exclude = ["**/*.py", "BUILD"]),
    imports = ["."],  # make the wheel's top-level packages importable
    visibility = ["//visibility:public"],
)
""")

python_wheel = repository_rule(
    implementation = _python_wheel_impl,
    attrs = {
        "url": attr.string(mandatory = True),
        "sha256": attr.string(mandatory = True),
        "strip_prefix": attr.string(default = ""),
    },
)
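Once such a repository is declared (the next section does this for a wheel named custom_ml_lib), ordinary targets consume it through the generated :pkg target. A minimal sketch, with //ml/BUILD and model.py as hypothetical names:

# //ml/BUILD (sketch)
load("@rules_python//python:defs.bzl", "py_library")

py_library(
    name = "model",
    srcs = ["model.py"],
    deps = ["@custom_ml_lib//:pkg"],
)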

Version Pinning and Lock Files

Implement sophisticated version management:

# //third_party/python/requirements.bzl
load("@rules_python//python:pip.bzl", "pip_parse")
load("//tools:custom_repos.bzl", "python_wheel")

def install_python_deps():
    """Install all Python dependencies with exact versions."""

    # Pinned PyPI dependencies, generated from requirements.lock.
    # pip_parse replaces the deprecated pip_install and consumes the lock file;
    # the WORKSPACE must then call install_deps() from the generated
    # @pypi_deps//:requirements.bzl.
    pip_parse(
        name = "pypi_deps",
        requirements_lock = "//third_party/python:requirements.lock",
        python_interpreter_target = "@python_3_11_host//:python",
    )

    # Custom wheels
    python_wheel(
        name = "custom_ml_lib",
        url = "https://files.pythonhosted.org/packages/.../custom_ml_lib-1.0.0-py3-none-any.whl",
        sha256 = "abc123...",
    )
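The lock file itself can be maintained from inside the build. A sketch using rules_python's compile_pip_requirements, assuming a //third_party/python/requirements.in that lists the top-level dependencies:

# //third_party/python/BUILD (sketch)
load("@rules_python//python:pip.bzl", "compile_pip_requirements")

compile_pip_requirements(
    name = "requirements",
    requirements_in = "requirements.in",
    requirements_txt = "requirements.lock",
)

Regenerate the pinned file with bazel run //third_party/python:requirements.update; the macro also defines a test that fails when the lock file drifts out of sync with requirements.in.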

5.4 Code Generation and Aspects

Protocol Buffer Integration

Advanced protobuf handling with custom rules:

# //proto/BUILD
load("@rules_proto//proto:defs.bzl", "proto_library")
load("@rules_python//python:defs.bzl", "py_library")
load("//tools:python_rules.bzl", "python_proto_library")

proto_library(
    name = "api_proto",
    srcs = ["api.proto"],
)

# The custom rule from 5.1 consumes .proto files directly rather than
# proto_library targets, so pass the sources themselves.
python_proto_library(
    name = "api_py_proto",
    srcs = ["api.proto"],
)

py_library(
    name = "api_client",
    srcs = ["api_client.py"],
    deps = [":api_py_proto"],
)

Using Aspects for Analysis

Create aspects for code analysis and transformation:

# //tools/analysis.bzl
load("@rules_python//python:defs.bzl", "PyInfo")

def _python_coverage_aspect_impl(target, ctx):
    """Collect Python source files for coverage analysis."""
    if PyInfo not in target:
        return []

    py_info = target[PyInfo]
    source_files = py_info.transitive_sources.to_list()

    coverage_file = ctx.actions.declare_file(target.label.name + ".coverage")

    ctx.actions.write(
        output = coverage_file,
        content = "\n".join([f.path for f in source_files if f.path.endswith(".py")]),
    )

    return [OutputGroupInfo(coverage_files = depset([coverage_file]))]

python_coverage_aspect = aspect(
    implementation = _python_coverage_aspect_impl,
    attr_aspects = ["deps"],
)
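Aspects can be wired into other rules via attributes, but the quickest way to try this one is to apply it from the command line and request its output group:

bazel build //ml_pipeline/... \
    --aspects=//tools:analysis.bzl%python_coverage_aspect \
    --output_groups=coverage_files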

5.5 Performance Optimization

Build Performance Tuning

Optimize build performance with advanced configurations:

# .bazelrc.performance
# Enable persistent workers for Python tools
build --worker_sandboxing=false
build --experimental_worker_multiplex

# Optimize Python rule execution
build --experimental_python_import_all_repositories

# Use faster Python stub generation
build --experimental_python_stub_imports

# Memory optimization
build --experimental_worker_memory_limit_mb=2048

# Remote execution optimization
build:remote --experimental_remote_merkle_tree_cache
build:remote --experimental_remote_cache_compression
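Flags like these are best validated against a profile rather than applied blindly. Bazel can record a trace profile for any invocation and summarize it afterwards:

# Record a trace profile for a build, then summarize the hot spots
bazel build //ml_pipeline/... --profile=/tmp/bazel_profile.gz
bazel analyze-profile /tmp/bazel_profile.gz

The same profile file can also be opened in chrome://tracing or Perfetto for a timeline view of actions and workers.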

Incremental Build Optimization

Configure rules for optimal incremental builds:

# //tools/optimized_rules.bzl
def optimized_py_library(name, srcs, deps = [], **kwargs):
    """Optimized Python library with better incremental builds."""

    # Separate interface and implementation
    native.py_library(
        name = name + "_interface",
        srcs = [s for s in srcs if s.endswith("_interface.py")],
        **kwargs
    )

    native.py_library(
        name = name,
        srcs = [s for s in srcs if not s.endswith("_interface.py")],
        deps = deps + [":" + name + "_interface"],
        **kwargs
    )
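Call sites use the macro exactly like py_library; the split pays off when downstream targets can depend on just the interface library. A sketch with hypothetical package and file names:

# //services/billing/BUILD (sketch)
load("//tools:optimized_rules.bzl", "optimized_py_library")

optimized_py_library(
    name = "billing",
    srcs = [
        "billing.py",
        "billing_interface.py",
    ],
    deps = ["@pypi_deps//requests"],
)

# Targets that only need the contract can depend on :billing_interface and are
# not invalidated when billing.py changes.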

5.6 Practical Exercise: ML Pipeline

Let's build a complete machine learning pipeline demonstrating advanced concepts:

# //ml_pipeline/BUILD
load("@rules_proto//proto:defs.bzl", "proto_library")
load("@rules_python//python:defs.bzl", "py_test")
load("//tools:python_macros.bzl", "python_microservice")
load("//tools:python_rules.bzl", "python_proto_library")

# Data schema
proto_library(
    name = "schema_proto",
    srcs = ["schema.proto"],
)

# As in 5.4, the custom rule takes the .proto files directly.
python_proto_library(
    name = "schema_py_proto",
    srcs = ["schema.proto"],
)

# Data processing service
python_microservice(
    name = "data_processor",
    srcs = [
        "data_processor.py",
        "data_utils.py",
    ],
    deps = [
        ":schema_py_proto",
        "@pypi_deps//pandas",
        "@pypi_deps//numpy",
    ],
)

# Model training service  
python_microservice(
    name = "model_trainer",
    srcs = [
        "model_trainer.py",
        "model_utils.py",
    ],
    deps = [
        ":schema_py_proto",
        "@pypi_deps//scikit_learn", 
        "@pypi_deps//joblib",
    ],
)

# Model serving service
python_microservice(
    name = "model_server",
    srcs = [
        "model_server.py",
        "serving_utils.py", 
    ],
    deps = [
        ":schema_py_proto",
        "@pypi_deps//flask",
        "@pypi_deps//joblib",
    ],
)

# Integration test
py_test(
    name = "pipeline_integration_test",
    srcs = ["pipeline_integration_test.py"],
    deps = [
        ":data_processor_lib",
        ":model_trainer_lib", 
        ":model_server_lib",
        "@pypi_deps//pytest",
        "@pypi_deps//requests",
    ],
    data = [
        "test_data.csv",
        "expected_model.pkl",
    ],
)
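With the targets in place, the whole pipeline builds and tests with the usual commands:

# Build every service and its Docker tarball
bazel build //ml_pipeline/...

# Run the per-service tests generated by the macro plus the integration test
bazel test //ml_pipeline/...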

5.7 Best Practices Summary

Rule Design Principles

  • Keep rules focused and composable
  • Use proper input/output declarations
  • Implement hermetic execution
  • Provide clear error messages
  • Document rule attributes and behavior

Toolchain Management

  • Use hermetic toolchains when possible
  • Version pin all dependencies
  • Test with multiple Python versions
  • Implement proper toolchain selection

Performance Considerations

  • Minimize rule overhead
  • Use aspects judiciously
  • Optimize for incremental builds
  • Profile build performance regularly

Module 5 Exercises

Exercise 1: Custom Rule

Create a custom rule that generates Python dataclasses from JSON schema files.

Exercise 2: Multi-Version Testing

Set up a test matrix that runs your tests against Python 3.8, 3.9, and 3.11.

Exercise 3: Aspect Implementation

Write an aspect that collects all Python import statements across your build graph.

Exercise 4: Performance Analysis

Profile your build and identify the top 3 bottlenecks, then implement optimizations.

Next Steps

In Module 6, we'll cover "Production Deployment and CI/CD Integration" where you'll learn to:

  • Set up remote caching and execution
  • Integrate with CI/CD systems
  • Implement automated testing pipelines
  • Deploy applications using Bazel

Key Takeaways

  • Custom rules and macros provide powerful extensibility
  • Toolchains enable hermetic, reproducible builds
  • Aspects offer cross-cutting analysis capabilities
  • Performance optimization requires systematic profiling
  • Advanced dependency management prevents version conflicts

https://www.linkedin.com/in/sushilbaligar/
https://github.com/sushilbaligar
https://dev.to/sushilbaligar
https://medium.com/@sushilbaligar
