## Learning Objectives
By the end of this module, you will:
- Master custom Python rules and macros
- Configure Python toolchains for different environments
- Implement hermetic Python builds
- Handle complex Python packaging scenarios
- Use aspects for code analysis and transformation
## 5.1 Custom Python Rules and Macros

### Creating Custom Rules

Bazel's extensibility shines when you need custom build logic. Let's create a custom rule for Python code generation:
```python
# //tools/python_rules.bzl
load("@rules_python//python:defs.bzl", "PyInfo")

def _python_proto_library_impl(ctx):
    """Implementation of the python_proto_library rule."""
    proto_files = ctx.files.srcs
    output_dir = ctx.actions.declare_directory("proto_gen")

    # Generate Python files from the .proto sources
    ctx.actions.run(
        inputs = proto_files,
        outputs = [output_dir],
        executable = ctx.executable._protoc,
        arguments = [
            "--python_out=" + output_dir.path,
            "--proto_path=" + proto_files[0].dirname,
        ] + [f.path for f in proto_files],
        mnemonic = "ProtocPython",
    )

    return [
        DefaultInfo(files = depset([output_dir])),
        PyInfo(
            transitive_sources = depset([output_dir]),
            imports = depset([output_dir.path]),
        ),
    ]

python_proto_library = rule(
    implementation = _python_proto_library_impl,
    attrs = {
        "srcs": attr.label_list(
            allow_files = [".proto"],
            mandatory = True,
        ),
        "_protoc": attr.label(
            default = "@com_google_protobuf//:protoc",
            executable = True,
            cfg = "exec",
        ),
    },
)
```
### Advanced Macros

Create sophisticated macros that generate multiple targets from a single call:
```python
# //tools/python_macros.bzl
def python_microservice(name, srcs, deps = [], **kwargs):
    """Macro for creating a complete Python microservice."""

    # Main library
    native.py_library(
        name = name + "_lib",
        srcs = srcs,
        deps = deps,
        **kwargs
    )

    # Binary
    native.py_binary(
        name = name,
        srcs = [name + "_main.py"],
        deps = [":" + name + "_lib"],
        main = name + "_main.py",
    )

    # Tests
    test_srcs = native.glob([name + "_*test.py"])
    if test_srcs:
        native.py_test(
            name = name + "_test",
            srcs = test_srcs,
            deps = [
                ":" + name + "_lib",
                "@pytest",
            ],
        )

    # Docker image (invokes docker on the host, so this genrule is not hermetic)
    native.genrule(
        name = name + "_docker",
        srcs = [":" + name],
        outs = [name + ".tar"],
        cmd = """
docker build -t {name}:latest - <<EOF
FROM python:3.11-slim
COPY $(location :{name}) /app/
WORKDIR /app
ENTRYPOINT ["python", "{name}"]
EOF
docker save {name}:latest > $@
""".format(name = name),
    )
```
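Because macros expand at load time, the targets they generate are ordinary targets you can inspect. For example, assuming the macro is instantiated in a hypothetical package `//services/auth`, Bazel's `generator_function` attribute filter lists everything a single call produced:

```shell
# List all targets generated by python_microservice calls in //services/auth
bazel query 'attr(generator_function, python_microservice, //services/auth:*)'
```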
## 5.2 Python Toolchains

### Configuring Multiple Python Versions

Set up toolchains for different Python versions:
```python
# //tools/python_toolchain/BUILD
load("@rules_python//python:defs.bzl", "py_runtime", "py_runtime_pair")

py_runtime(
    name = "python38_runtime",
    interpreter_path = "/usr/bin/python3.8",
    python_version = "PY3",
)

py_runtime(
    name = "python39_runtime",
    interpreter_path = "/usr/bin/python3.9",
    python_version = "PY3",
)

py_runtime(
    name = "python311_runtime",
    interpreter_path = "/usr/bin/python3.11",
    python_version = "PY3",
)

py_runtime_pair(
    name = "python38_runtime_pair",
    py3_runtime = ":python38_runtime",
)

py_runtime_pair(
    name = "python39_runtime_pair",
    py3_runtime = ":python39_runtime",
)

py_runtime_pair(
    name = "python311_runtime_pair",
    py3_runtime = ":python311_runtime",
)

toolchain(
    name = "python38_toolchain",
    toolchain = ":python38_runtime_pair",
    toolchain_type = "@rules_python//python:toolchain_type",
)

toolchain(
    name = "python39_toolchain",
    toolchain = ":python39_runtime_pair",
    toolchain_type = "@rules_python//python:toolchain_type",
)

toolchain(
    name = "python311_toolchain",
    toolchain = ":python311_runtime_pair",
    toolchain_type = "@rules_python//python:toolchain_type",
)
```
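Defining toolchains is only half the story: they must also be registered so Bazel's toolchain resolution can find them. A minimal sketch, assuming the BUILD file above lives at `//tools/python_toolchain`:

```python
# WORKSPACE
register_toolchains(
    "//tools/python_toolchain:python311_toolchain",
)
```

For ad-hoc runs against another version, passing `--extra_toolchains=//tools/python_toolchain:python39_toolchain` on the command line takes precedence over registered toolchains.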
### Hermetic Python Builds

Create fully hermetic builds with a downloaded, workspace-pinned interpreter:
```python
# WORKSPACE
load("@rules_python//python:repositories.bzl", "python_register_toolchains")

python_register_toolchains(
    name = "python_3_11",
    python_version = "3.11.4",
    # Downloads a standalone interpreter, so the build no longer depends on
    # whatever Python the host happens to have. ignore_root_user_error just
    # suppresses a failure when the build runs as root (e.g. in containers).
    ignore_root_user_error = True,
)

load("@python_3_11//:defs.bzl", "interpreter")
```
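On newer Bazel versions using bzlmod, the same hermetic toolchain is configured in MODULE.bazel instead of WORKSPACE. The extension label and version below are indicative and depend on your rules_python release:

```python
# MODULE.bazel
bazel_dep(name = "rules_python", version = "0.31.0")

python = use_extension("@rules_python//python/extensions:python.bzl", "python")
python.toolchain(python_version = "3.11")
```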
## 5.3 Advanced Dependency Management

### Custom Repository Rules

Create custom repository rules for complex dependency scenarios:
```python
# //tools/custom_repos.bzl
def _python_wheel_impl(repository_ctx):
    """Download and extract a Python wheel."""
    repository_ctx.download_and_extract(
        url = repository_ctx.attr.url,
        sha256 = repository_ctx.attr.sha256,
        stripPrefix = repository_ctx.attr.strip_prefix,
    )

    # Generate a BUILD file for the extracted contents
    repository_ctx.file("BUILD", """
load("@rules_python//python:defs.bzl", "py_library")

py_library(
    name = "pkg",
    srcs = glob(["**/*.py"]),
    data = glob(["**/*"], exclude = ["**/*.py", "BUILD"]),
    visibility = ["//visibility:public"],
)
""")

python_wheel = repository_rule(
    implementation = _python_wheel_impl,
    attrs = {
        "url": attr.string(mandatory = True),
        "sha256": attr.string(mandatory = True),
        "strip_prefix": attr.string(default = ""),
    },
)
```
### Version Pinning and Lock Files

Implement sophisticated version management:
```python
# //third_party/python/requirements.bzl
load("@rules_python//python:pip.bzl", "pip_parse")
load("//tools:custom_repos.bzl", "python_wheel")

def install_python_deps():
    """Install all Python dependencies with exact versions."""
    # Auto-generated from requirements.lock; pip_parse generates a repo whose
    # install_deps() must also be called from the WORKSPACE after this macro.
    pip_parse(
        name = "pypi_deps",
        requirements_lock = "//third_party/python:requirements.lock",
        python_interpreter_target = "@python_3_11_host//:python",
    )

    # Custom wheels
    python_wheel(
        name = "custom_ml_lib",
        url = "https://files.pythonhosted.org/packages/.../custom_ml_lib-1.0.0-py3-none-any.whl",
        sha256 = "abc123...",
    )
```
## 5.4 Code Generation and Aspects

### Protocol Buffer Integration

Advanced protobuf handling with custom rules:
```python
# //proto/BUILD
load("@rules_proto//proto:defs.bzl", "proto_library")
load("@rules_python//python:defs.bzl", "py_library")
load("//tools:python_rules.bzl", "python_proto_library")

proto_library(
    name = "api_proto",
    srcs = ["api.proto"],
)

python_proto_library(
    name = "api_py_proto",
    srcs = [":api_proto"],
)

py_library(
    name = "api_client",
    srcs = ["api_client.py"],
    deps = [":api_py_proto"],
)
```
### Using Aspects for Analysis

Create aspects for code analysis and transformation:
```python
# //tools/analysis.bzl
load("@rules_python//python:defs.bzl", "PyInfo")

def _python_coverage_aspect_impl(target, ctx):
    """Collect Python source files for coverage analysis."""
    if PyInfo not in target:
        return []

    py_info = target[PyInfo]
    source_files = py_info.transitive_sources.to_list()

    coverage_file = ctx.actions.declare_file(target.label.name + ".coverage")
    ctx.actions.write(
        output = coverage_file,
        content = "\n".join([f.path for f in source_files if f.path.endswith(".py")]),
    )

    return [OutputGroupInfo(coverage_files = depset([coverage_file]))]

python_coverage_aspect = aspect(
    implementation = _python_coverage_aspect_impl,
    attr_aspects = ["deps"],
)
```
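Aspects defined this way require no BUILD-file changes; you apply them from the command line and request the output group they declare:

```shell
bazel build //... \
    --aspects=//tools:analysis.bzl%python_coverage_aspect \
    --output_groups=coverage_files
```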
## 5.5 Performance Optimization

### Build Performance Tuning

Optimize build performance with advanced configurations:
```
# .bazelrc.performance

# Enable persistent workers for Python tools
build --worker_sandboxing=false
build --experimental_worker_multiplex

# Optimize Python rule execution
build --experimental_python_import_all_repositories

# Use faster Python stub generation
build --experimental_python_stub_imports

# Memory optimization
build --experimental_worker_memory_limit_mb=2048

# Remote execution optimization
build:remote --experimental_remote_merkle_tree_cache
build:remote --experimental_remote_cache_compression
```

Note that experimental flags come and go between Bazel releases, so verify each one against the Bazel version you actually run.
### Incremental Build Optimization

Configure rules for optimal incremental builds:
```python
# //tools/optimized_rules.bzl
def optimized_py_library(name, srcs, deps = [], **kwargs):
    """Optimized Python library with better incremental builds."""

    # Separate interface and implementation
    native.py_library(
        name = name + "_interface",
        srcs = [s for s in srcs if s.endswith("_interface.py")],
        **kwargs
    )

    native.py_library(
        name = name,
        srcs = [s for s in srcs if not s.endswith("_interface.py")],
        deps = deps + [":" + name + "_interface"],
        **kwargs
    )
```
## 5.6 Practical Exercise: ML Pipeline

Let's build a complete machine learning pipeline demonstrating advanced concepts:
```python
# //ml_pipeline/BUILD
load("@rules_proto//proto:defs.bzl", "proto_library")
load("@rules_python//python:defs.bzl", "py_test")
load("//tools:python_macros.bzl", "python_microservice")
load("//tools:python_rules.bzl", "python_proto_library")

# Data schema
proto_library(
    name = "schema_proto",
    srcs = ["schema.proto"],
)

python_proto_library(
    name = "schema_py_proto",
    srcs = [":schema_proto"],
)

# Data processing service
python_microservice(
    name = "data_processor",
    srcs = [
        "data_processor.py",
        "data_utils.py",
    ],
    deps = [
        ":schema_py_proto",
        "@pypi_deps//pandas",
        "@pypi_deps//numpy",
    ],
)

# Model training service
python_microservice(
    name = "model_trainer",
    srcs = [
        "model_trainer.py",
        "model_utils.py",
    ],
    deps = [
        ":schema_py_proto",
        "@pypi_deps//scikit_learn",
        "@pypi_deps//joblib",
    ],
)

# Model serving service
python_microservice(
    name = "model_server",
    srcs = [
        "model_server.py",
        "serving_utils.py",
    ],
    deps = [
        ":schema_py_proto",
        "@pypi_deps//flask",
        "@pypi_deps//joblib",
    ],
)

# Integration test
py_test(
    name = "pipeline_integration_test",
    srcs = ["pipeline_integration_test.py"],
    deps = [
        ":data_processor_lib",
        ":model_trainer_lib",
        ":model_server_lib",
        "@pypi_deps//pytest",
        "@pypi_deps//requests",
    ],
    data = [
        "test_data.csv",
        "expected_model.pkl",
    ],
)
```
## 5.7 Best Practices Summary

### Rule Design Principles
- Keep rules focused and composable
- Use proper input/output declarations
- Implement hermetic execution
- Provide clear error messages
- Document rule attributes and behavior
### Toolchain Management
- Use hermetic toolchains when possible
- Version pin all dependencies
- Test with multiple Python versions
- Implement proper toolchain selection
### Performance Considerations
- Minimize rule overhead
- Use aspects judiciously
- Optimize for incremental builds
- Profile build performance regularly
## Module 5 Exercises

### Exercise 1: Custom Rule
Create a custom rule that generates Python dataclasses from JSON schema files.
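A possible starting point for the generator such a rule would wrap, handling only a flat, simplified subset of JSON Schema (a title plus scalar properties). All names here are illustrative, not part of any existing tool:

```python
import json

# Mapping from (a subset of) JSON-schema scalar types to Python annotations
_TYPE_MAP = {"string": "str", "integer": "int", "number": "float", "boolean": "bool"}

def dataclass_source(schema: str) -> str:
    """Render a JSON-schema string as Python dataclass source code."""
    spec = json.loads(schema)
    lines = [
        "from dataclasses import dataclass",
        "",
        "@dataclass",
        "class %s:" % spec["title"],
    ]
    for field, meta in spec["properties"].items():
        lines.append("    %s: %s" % (field, _TYPE_MAP[meta["type"]]))
    return "\n".join(lines)

print(dataclass_source(
    '{"title": "User", "properties":'
    ' {"name": {"type": "string"}, "age": {"type": "integer"}}}'
))
```

A real rule would run this tool via `ctx.actions.run`, writing the generated source to a declared output file.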
### Exercise 2: Multi-Version Testing
Set up a test matrix that runs your tests against Python 3.8, 3.9, and 3.11.
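One way to structure the matrix is a macro that stamps out one `py_test` per interpreter. This sketch assumes a rules_python release whose `py_test` accepts a per-target `python_version` attribute, with toolchains for all listed versions registered:

```python
# //tools/test_matrix.bzl (sketch)
load("@rules_python//python:defs.bzl", "py_test")

def py_version_matrix_test(name, srcs, deps = [], versions = ["3.8", "3.9", "3.11"]):
    """Generate one py_test per Python version, e.g. name_py3_11."""
    for version in versions:
        py_test(
            name = "{}_py{}".format(name, version.replace(".", "_")),
            srcs = srcs,
            deps = deps,
            python_version = version,
        )
```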
### Exercise 3: Aspect Implementation
Write an aspect that collects all Python import statements across your build graph.
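The aspect side mirrors `python_coverage_aspect` above; the per-file work it delegates to a tool can be as small as this import collector built on the standard `ast` module:

```python
import ast

def collect_imports(source: str) -> list:
    """Return the sorted top-level module names imported by a Python source string."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import a.b` imports top-level package `a`
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # skip relative imports like `from . import x` (node.module is None)
            modules.add(node.module.split(".")[0])
    return sorted(modules)

print(collect_imports("import os\nimport numpy.linalg\nfrom collections import deque"))
# → ['collections', 'numpy', 'os']
```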
### Exercise 4: Performance Analysis
Profile your build and identify the top 3 bottlenecks, then implement optimizations.
## Next Steps
In Module 6, we'll cover "Production Deployment and CI/CD Integration" where you'll learn to:
- Set up remote caching and execution
- Integrate with CI/CD systems
- Implement automated testing pipelines
- Deploy applications using Bazel
## Key Takeaways
- Custom rules and macros provide powerful extensibility
- Toolchains enable hermetic, reproducible builds
- Aspects offer cross-cutting analysis capabilities
- Performance optimization requires systematic profiling
- Advanced dependency management prevents version conflicts
https://www.linkedin.com/in/sushilbaligar/
https://github.com/sushilbaligar
https://dev.to/sushilbaligar
https://medium.com/@sushilbaligar