
Thesius Code

Posted on • Originally published at datanest-stores.pages.dev

Hyperparameter Tuning Kit

Production-ready hyperparameter optimization with Optuna and Ray Tune. Define search spaces declaratively, run distributed sweeps, and find optimal configurations faster with intelligent pruning and early stopping.

Key Features

  • Optuna integration — Bayesian optimization out of the box via TPE and CMA-ES samplers, plus grid and random search
  • Ray Tune configs — distributed hyperparameter search across multiple machines and GPUs
  • Smart pruning — Median, Hyperband, and ASHA pruners to kill underperforming trials early
  • Declarative search spaces — define search spaces in YAML, not scattered through code
  • Multi-objective optimization — optimize accuracy AND latency simultaneously with Pareto fronts
  • Visualization dashboards — parameter importance plots, optimization history, contour maps
  • Experiment resumption — persistent storage backends so sweeps survive restarts
  • Sklearn + PyTorch examples — complete tuning scripts for both frameworks

Quick Start

# 1. Copy the config
cp config.example.yaml config.yaml

# 2. Run a quick Optuna study
python examples/optuna_basic.py

# 3. View results in the Optuna dashboard
optuna-dashboard sqlite:///optuna_studies.db
"""Tune a RandomForest with Optuna in 20 lines."""
import optuna
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

def objective(trial: optuna.Trial) -> float:
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 500),
        "max_depth": trial.suggest_int("max_depth", 3, 30),
        "min_samples_split": trial.suggest_int("min_samples_split", 2, 20),
        "min_samples_leaf": trial.suggest_int("min_samples_leaf", 1, 10),
        "max_features": trial.suggest_categorical("max_features", ["sqrt", "log2"]),
    }

    clf = RandomForestClassifier(**params, random_state=42, n_jobs=-1)
    score = cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
    return score

study = optuna.create_study(direction="maximize", study_name="rf-tuning")
study.optimize(objective, n_trials=100, show_progress_bar=True)

print(f"Best accuracy: {study.best_value:.4f}")
print(f"Best params: {study.best_params}")

Architecture

hyperparameter-tuning-kit/
├── config.example.yaml          # Search space and tuning configuration
├── templates/
│   ├── search_spaces.py         # Declarative search space definitions
│   ├── optuna_tuner.py          # Optuna study wrapper with pruning
│   ├── ray_tuner.py             # Ray Tune scheduler and search configs
│   ├── pruners.py               # Pruning strategy implementations
│   └── visualization.py         # Result plotting utilities
├── docs/
│   └── overview.md
└── examples/
    ├── optuna_basic.py          # Single-objective sklearn tuning
    ├── optuna_pytorch.py        # PyTorch training loop with pruning
    ├── ray_distributed.py       # Multi-node distributed tuning
    └── multi_objective.py       # Pareto-optimal search

Usage Examples

PyTorch with Optuna Pruning

import optuna
import torch
import torch.nn as nn

def objective(trial: optuna.Trial) -> float:
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    hidden_size = trial.suggest_int("hidden_size", 32, 512)
    dropout = trial.suggest_float("dropout", 0.1, 0.5)

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = nn.Sequential(
        nn.Linear(784, hidden_size),
        nn.ReLU(),
        nn.Dropout(dropout),
        nn.Linear(hidden_size, 10),
    ).to(device)

    optimizer = torch.optim.Adam(model.parameters(), lr=lr)

    # train_one_epoch / evaluate and the data loaders are helpers defined in
    # examples/optuna_pytorch.py.
    for epoch in range(50):
        train_loss = train_one_epoch(model, optimizer, train_loader)
        val_acc = evaluate(model, val_loader)

        # Report intermediate value for pruning
        trial.report(val_acc, epoch)
        if trial.should_prune():
            raise optuna.TrialPruned()

    return val_acc

study = optuna.create_study(
    direction="maximize",
    pruner=optuna.pruners.HyperbandPruner(max_resource=50),
    storage="sqlite:///optuna_studies.db",
)
study.optimize(objective, n_trials=200)

Configuration

# config.example.yaml
study:
  name: "model-optimization"
  direction: "maximize"            # maximize | minimize
  storage: "sqlite:///optuna_studies.db"
  n_trials: 200

search_space:
  learning_rate: { type: float, low: 1e-5, high: 1e-1, log: true }
  batch_size: { type: categorical, choices: [16, 32, 64, 128] }
  hidden_size: { type: int, low: 32, high: 512, step: 32 }

pruning:
  strategy: "hyperband"            # median | hyperband | asha
  max_resource: 50
  reduction_factor: 3

distributed:
  n_jobs: 4                        # Parallel trial workers
  backend: "optuna"                # optuna | ray

Best Practices

  1. Start with TPE (default), not grid search — Bayesian optimization finds good regions in far fewer trials
  2. Always enable pruning — Hyperband pruner saves 50-70% of compute by stopping bad trials early
  3. Use log-uniform for learning rates — suggest_float("lr", 1e-5, 1e-1, log=True) samples evenly across orders of magnitude
  4. Persist studies to a database — use SQLite or PostgreSQL storage so you can resume after interruptions
  5. Run parameter importance analysis — optuna.importance.get_param_importances(study) tells you which params actually matter

Troubleshooting

| Problem | Cause | Fix |
| --- | --- | --- |
| All trials pruned | Pruner too aggressive or metric reported incorrectly | Set min_resource=10 to let trials warm up before pruning |
| Study not resumable | In-memory storage (default) | Set storage="sqlite:///study.db" in create_study() |
| Duplicate parameter suggestions | Small search space exhausted | Widen ranges or switch from grid to the TPE sampler |
| Parallel trials return same params | Default sampler not multi-worker aware | Use optuna.samplers.TPESampler(multivariate=True, n_startup_trials=20) |

This is 1 of 10 resources in the ML Starter Kit toolkit. Get the complete Hyperparameter Tuning Kit with all files, templates, and documentation for $29.

Get the Full Kit →

Or grab the entire ML Starter Kit bundle (10 products) for $149 — save 30%.

Get the Complete Bundle →
