DEV Community

Thisal Dilmith

Why GridSearchCV Wastes Most of Its Time — And What I Did About It

If you've ever tuned hyperparameters on a large grid, you know the pain. You kick off a GridSearchCV, go make coffee, come back, and it's still running. Maybe you go to lunch. Maybe it's still running.

I got frustrated enough to build something different.


The Problem with GridSearchCV

GridSearchCV is brute force by design. For a grid with k parameters and n values each, it evaluates nᵏ configurations, which means nᵏ × cv_folds model fits. Every configuration is fitted, regardless of how poorly one of its values performed earlier.

Here's what that actually means in practice:

| Problem | Impact |
|---|---|
| Dead-end values are never discarded | A bad learning_rate=0.5 is re-evaluated in every downstream combination |
| No learning from early results | The search treats round 1 and round 1000 as equally uninformed |
| Exponential cost scaling | Adding one new 4-value parameter quadruples total training time |

The last point is the killer. Your grid doesn't have to be huge for this to hurt — it just has to grow.
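To make the scaling concrete, here is a small sketch that counts combinations and fits for an exhaustive search using only the standard library. The grid values and the `count_fits` helper are made up for illustration; they are not part of scikit-learn or EliminationSearchCV.

```python
from itertools import product

def count_fits(param_grid, cv_folds):
    """Return (combinations, fits) for an exhaustive grid search."""
    combos = sum(1 for _ in product(*param_grid.values()))
    return combos, combos * cv_folds

# Illustrative grid: three parameters with three values each.
grid = {
    "learning_rate": [0.01, 0.1, 0.5],
    "max_depth": [2, 4, 8],
    "n_estimators": [50, 100, 200],
}
combos, fits = count_fits(grid, cv_folds=5)
print(combos, fits)  # 27 combinations, 135 fits

# Adding one 4-value parameter multiplies both numbers by 4.
grid["subsample"] = [0.5, 0.7, 0.9, 1.0]
bigger_combos, bigger_fits = count_fits(grid, cv_folds=5)
print(bigger_combos, bigger_fits)  # 108 combinations, 540 fits
```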


The Idea: Eliminate Instead of Enumerate

What if instead of evaluating everything upfront, we tested parameter values in rounds — and dropped the bad ones before they compound?

That's EliminationSearchCV.

It works like this:

  • Round 1: Test each parameter value in isolation. Score them per-parameter and eliminate the worst performers.
  • Round 2: Test surviving pairs. Rank all combinations globally, keep the top fraction.
  • Round 3+: Repeat with triples, then full combinations, until one winner remains.

Bad values get cut early. They never get the chance to multiply into thousands of useless combinations.
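The round structure can be sketched as a generator that yields candidates setting exactly one, two, or more parameters. This is a minimal illustration, not the library's code: `candidates_for_round` is a hypothetical helper, the grid is made up, and it assumes that parameters missing from a partial candidate fall back to the estimator's defaults.

```python
from itertools import combinations, product

def candidates_for_round(param_grid, size):
    """Yield partial configurations that set exactly `size` parameters."""
    for names in combinations(sorted(param_grid), size):
        for values in product(*(param_grid[n] for n in names)):
            yield dict(zip(names, values))

grid = {
    "C": [0.1, 1, 10],
    "penalty": ["l1", "l2"],
    "solver": ["liblinear", "saga"],
}

round_1 = list(candidates_for_round(grid, 1))  # single values
round_2 = list(candidates_for_round(grid, 2))  # pairs, no pruning
print(len(round_1), len(round_2))  # 7 singles, 16 pairs

# Hypothetical survivors if Round 1 kept one value per parameter.
pruned = {"C": [1], "penalty": ["l1"], "solver": ["liblinear"]}
pruned_round_2 = list(candidates_for_round(pruned, 2))
print(len(pruned_round_2))  # 3 pairs after pruning
```

The gap between 16 and 3 pairs is the point of eliminating early: every value dropped in Round 1 removes all the pairs and triples it would have appeared in.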


A Concrete Example

Let's tune a LogisticRegression with 4 parameters:

param_grid = {
    'C':        [0.001, 0.01, 0.1, 1, 10, 100],   # 6 values
    'penalty':  ['l1', 'l2'],                       # 2 values
    'solver':   ['liblinear', 'saga'],              # 2 values
    'max_iter': [1000, 2000],                       # 2 values
}
# GridSearchCV: 6 × 2 × 2 × 2 = 48 combos × 5 folds = 240 fits

With EliminationSearchCV and elimination_rate=0.8 (keep best 20%):

| Round | Combos tested | Grid after elimination |
|---|---|---|
| 1: single params | 12 | C:[1], penalty:['l1'], solver:['liblinear'], max_iter:[1000] |
| 2: pairs | 6 | unchanged (already 1 value each) |
| 3: triples | 4 | unchanged |
| 4: full | 1 | final result |
| Total | 23 | vs 48 for GridSearchCV |

With 5 folds per combo, that is 115 fits vs 240 for GridSearchCV.

Same best params. A fraction of the work.
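The per-round counts follow from simple counting once Round 1 has left one value per parameter: Round 1 tests every value once, and each later round tests one candidate per subset of parameters. The sketch below reproduces the numbers under that assumption, and further assumes every candidate is cross-validated on all 5 folds.

```python
from math import comb

grid_sizes = [6, 2, 2, 2]        # C, penalty, solver, max_iter
n_params = len(grid_sizes)
cv_folds = 5

round_1 = sum(grid_sizes)        # every value tested on its own: 12
round_2 = comb(n_params, 2)      # one surviving value each: 6 pairs
round_3 = comb(n_params, 3)      # 4 triples
round_4 = comb(n_params, 4)      # 1 full combination

candidates = round_1 + round_2 + round_3 + round_4
print(candidates, candidates * cv_folds)  # 23 candidates, 115 fits
```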


Drop-in Replacement

The API intentionally mirrors GridSearchCV for the core workflow (fit(), best_params_, best_score_, best_estimator_):

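from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from EliminationSearchCV import EliminationSearchCV

model = LogisticRegression()

# Before
search = GridSearchCV(model, param_grid, cv=5)

# After: swap the class name and set elimination_rate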
search = EliminationSearchCV(
    estimator=model,
    param_grid=param_grid,
    scoring='accuracy',
    cv=5,
    elimination_rate=0.8,  # eliminate worst 80% each round
)

search.fit(X_train, y_train)

# Same interface as GridSearchCV
print(search.best_params_)
# → {'C': 1, 'penalty': 'l1', 'solver': 'liblinear', 'max_iter': 1000}

print(search.best_score_)
# → 0.9248

# Already refitted on full training set — ready to predict
search.best_estimator_.predict(X_test)

One Thing I'm Proud Of: Invalid Combo Handling

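scikit-learn raises errors for incompatible combinations, such as penalty='l1' with solver='lbfgs'. With its default error_score=np.nan, GridSearchCV records a NaN score and emits a warning for each failed fit; with error_score='raise' it stops at the first failure. Either way, the invalid combinations stay in the grid unless you filter them out manually, for example by passing a list of grids.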

EliminationSearchCV catches any exception during fit(), scores that combination 0.0, and lets the elimination logic handle it naturally. Invalid combos just die in Round 1. No special handling needed from you.
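The catch-and-score-zero idea can be sketched in a few lines. This is an illustration of the pattern only: `safe_score` and `toy_score` are hypothetical names, and `toy_score` stands in for real cross-validated scoring.

```python
def safe_score(score_fn, params, failure_score=0.0):
    """Score one candidate; any exception becomes `failure_score`."""
    try:
        return score_fn(params)
    except Exception:
        return failure_score

def toy_score(params):
    """Stand-in for cross-validated scoring (hypothetical)."""
    if params["penalty"] == "l1" and params["solver"] == "lbfgs":
        raise ValueError("lbfgs does not support the l1 penalty")
    return 0.9

candidates = [
    {"penalty": "l2", "solver": "lbfgs"},
    {"penalty": "l1", "solver": "lbfgs"},      # invalid pairing
    {"penalty": "l1", "solver": "liblinear"},
]
scores = [safe_score(toy_score, c) for c in candidates]
print(scores)  # [0.9, 0.0, 0.9]
```

One trade-off of this pattern: catching every exception also hides genuine bugs, such as a typo in a parameter name, and a 0.0 floor only ranks last for metrics where higher is better and scores are non-negative.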


Benchmark Results

Tested across 5 models and 3 datasets (cv=2, elimination_rate=0.8, 10,000 samples):

| Model | Grid | Speedup | Accuracy diff |
|---|---|---|---|
| DecisionTree | Full | 152x | -0.0008 |
| RandomForest | Full | 36x | -0.0002 |
| GradientBoosting | Full | 35x | -0.0194 |
| KNeighbors | Full | 11x | -0.0004 |
| LogisticRegression | Full | 4x | -0.0004 |

Full grids are where this shines. The accuracy trade-off is small: at most 0.0194 (GradientBoosting), and under 0.001 for the other four models.

Honest caveat: Light grids (small search spaces) are actually slower with this approach. The elimination overhead doesn't pay off when there are only a few combinations to begin with. If your grid is small, stick with GridSearchCV.
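A rough candidate-count model shows one reason small grids may not benefit. The sketch assumes Round 1 leaves exactly one value per parameter and counts candidates only; it ignores per-round overhead and fit cost, so treat it as a back-of-the-envelope comparison rather than a timing prediction. Both helper functions are hypothetical.

```python
from math import comb, prod

def grid_candidates(sizes):
    """Candidates an exhaustive grid search evaluates."""
    return prod(sizes)

def elimination_candidates(sizes):
    """Rough count, assuming Round 1 leaves one value per parameter."""
    k = len(sizes)
    return sum(sizes) + sum(comb(k, i) for i in range(2, k + 1))

for sizes in ([2, 2], [3, 3], [6, 2, 2, 2], [5, 5, 5, 5]):
    print(sizes, grid_candidates(sizes), elimination_candidates(sizes))
# [2, 2] 4 5
# [3, 3] 9 7
# [6, 2, 2, 2] 48 23
# [5, 5, 5, 5] 625 31
```

For a 2 × 2 grid the rounds evaluate more candidates than the grid contains, while the advantage grows quickly as parameters and values are added.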


Architecture: How It's Built

The library is two files:

src/EliminationSearchCV/
├── EliminationSearchCV.py   ← Core class: fit(), elimination logic, scoring
└── Utils.py                 ← Stateless utilities: fold creation, combination generation, metrics

The flow inside fit():

EliminationSearchCV.fit(X, y)
    │
    ├─▶ Utils.create_cv_data_sets()         — StratifiedKFold/KFold splits
    │
    └─▶ [For each round i = 1 … n_params]
             │
             ├─▶ generate_param_combinations_with_limit(grid, limit=i)
             │
             ├─▶ _score_candidates(candidates)
             │         — per-fold metric evaluation
             │
             └─▶ _eliminate_low_scoring_values(candidates, scores)
                       ├─▶ _eliminate_single_param_values()   — Round 1
                       └─▶ _eliminate_multi_param_values()    — Rounds 2+

A key design decision: in Round 1, each parameter's values are scored and compared in isolation — so C values compete only against other C values, not against penalty values. This prevents interference between parameters that are on completely different scales.

In later rounds, all combinations are ranked globally and the top (1 - elimination_rate) fraction survives.
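The two ranking modes can share one helper. The sketch below is not the library's implementation: `keep_top` is a hypothetical function, the scores are invented, and always keeping at least one item is an assumption made so that a 2-value parameter is not emptied at elimination_rate=0.8.

```python
import math

def keep_top(scored, elimination_rate):
    """Return the best (1 - elimination_rate) share of (item, score) pairs.

    Always keeps at least one item (an assumption of this sketch).
    """
    # The small epsilon guards against float error, e.g. 10 * (1 - 0.8).
    n_keep = max(1, math.floor(len(scored) * (1 - elimination_rate) + 1e-9))
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in ranked[:n_keep]]

# Round 1: each parameter is ranked only against its own values.
round_1_scores = {
    "C": [(0.01, 0.81), (0.1, 0.88), (1, 0.92), (10, 0.91), (100, 0.90)],
    "penalty": [("l1", 0.92), ("l2", 0.90)],
}
survivors = {name: keep_top(pairs, 0.8) for name, pairs in round_1_scores.items()}
print(survivors)  # {'C': [1], 'penalty': ['l1']}

# Rounds 2+: all combinations share one global ranking.
round_2_scores = [
    (("C=1", "l1"), 0.92), (("C=1", "l2"), 0.90), (("C=10", "l1"), 0.91),
    (("C=10", "l2"), 0.89), (("C=0.1", "l1"), 0.88),
]
top_pairs = keep_top(round_2_scores, 0.6)
print(top_pairs)  # [('C=1', 'l1'), ('C=10', 'l1')]
```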


What's Working and What's Not Yet

Currently supported:

  • fit(), best_params_, best_score_, best_estimator_
  • Round 1: per-parameter isolation and elimination
  • Rounds 2+: global combination ranking
  • StratifiedKFold / KFold cross-validation
  • Invalid combination handling
  • Scoring: accuracy, precision, recall, f1, roc_auc

On the roadmap:

  • cv_results_ (per-fold score breakdown)
  • n_jobs parallel evaluation via joblib
  • verbose logging
  • Full pytest test suite
  • Scikit-learn BaseEstimator compatibility

Try It

pip install elimination-search-cv

Requirements: Python ≥ 3.8. scikit-learn and numpy install automatically.

GitHub: https://github.com/thisal-d/elimination-search-cv


Honest Disclaimer

This is an experimental approach. The quality of results depends heavily on the dataset and model. I'm actively benchmarking it and the results so far are promising — but I wouldn't call it production-ready yet.

What I'd genuinely love is feedback on edge cases where it fails. If you try it on a grid where it gives clearly wrong results or behaves unexpectedly, please open an issue. That's more useful to me right now than praise.


If you found this interesting, a ⭐ on the repo helps a lot — it keeps the motivation alive to keep building.

Tags: python machinelearning datascience opensource
