Manoj Kumar Patra

Design Patterns for Resilient Serving - Continued Model Evaluation

This design pattern helps detect and take action when a deployed model is no longer fit-for-purpose.

Reasons for model degradation

  1. Concept drift
  2. Data drift

| Concept drift | Data drift |
| --- | --- |
| The relationship between the model inputs and the target has changed. | Any change in the data being fed to the model for predictions compared to the data that was used for training. |
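
Data drift can often be spotted with a simple distribution comparison between training data and live traffic. Here is a minimal sketch, assuming NumPy arrays of a single numeric feature; the `detect_data_drift` helper, the significance threshold, and the two-sample Kolmogorov-Smirnov test are illustrative choices, not part of the pattern itself:

```python
import numpy as np
from scipy.stats import ks_2samp

def detect_data_drift(train_values: np.ndarray,
                      serving_values: np.ndarray,
                      alpha: float = 0.01) -> bool:
    """Flag drift when the serving distribution differs from training.

    Uses a two-sample Kolmogorov-Smirnov test; `alpha` is an
    illustrative significance level, not a universal default.
    """
    result = ks_2samp(train_values, serving_values)
    return result.pvalue < alpha

# Example with synthetic data: serving traffic has shifted upward.
train = np.random.normal(loc=0.0, scale=1.0, size=5_000)
serving = np.random.normal(loc=0.5, scale=1.0, size=5_000)
print(detect_data_drift(train, serving))  # True -> investigate drift
```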

Identifying model deterioration

  1. Continuous monitoring of the model's predictive performance over time
  2. Assess this performance with the same evaluation metrics used during development
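
Where ground truth arrives with a delay, this can be as simple as periodically recomputing the development-time metric over recently labeled traffic. A minimal sketch, assuming a pandas DataFrame of logged predictions already joined with their true labels (the column names and seven-day window are illustrative):

```python
import pandas as pd
from sklearn.metrics import accuracy_score

def evaluate_recent_window(logged: pd.DataFrame, days: int = 7) -> float:
    """Recompute the development-time metric on recent labeled traffic.

    `logged` is assumed to have tz-aware UTC timestamps and the columns:
    timestamp, prediction, ground_truth.
    """
    cutoff = pd.Timestamp.now(tz="UTC") - pd.Timedelta(days=days)
    window = logged[logged["timestamp"] >= cutoff]
    return accuracy_score(window["ground_truth"], window["prediction"])
```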

Continuous model evaluation provides a framework to evaluate a deployed model's performance exclusively on new data. This means we can detect a model's staleness as early as possible. This information helps us do either of the following:

  1. Retrain the model
  2. Replace the existing model with a new version entirely.

This is done by capturing the following:

  1. ground truth
  2. prediction inputs and outputs for comparing with ground truth values
  3. model versions
  4. timestamp of prediction requests
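
In practice, this usually means emitting one structured record per prediction request so it can later be joined with ground truth. A minimal sketch of such a logging helper; the record schema and the `log_prediction` name are illustrative:

```python
import json
import uuid
from datetime import datetime, timezone

def log_prediction(inputs: dict, output, model_version: str) -> dict:
    """Capture everything needed to later compare against ground truth."""
    record = {
        "prediction_id": str(uuid.uuid4()),  # join key for ground truth later
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "inputs": inputs,
        "output": output,
        "ground_truth": None,  # filled in once the true label arrives
    }
    print(json.dumps(record))  # stand-in for writing to a log sink / warehouse
    return record
```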

Triggers for retraining

Whether to retrain based on an evaluation report depends on how much performance deterioration is acceptable relative to the cost of retraining.

Setting a higher threshold for model performance ensures a higher-quality model in production, but it requires more frequent retraining jobs, which is costly.

A lower threshold, on the other hand, is more cost-effective, but there is a greater chance of serving a stale model in production.
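
This trade-off reduces to a simple threshold check in code. A minimal sketch, where the threshold value and the `trigger_retraining_pipeline` hook are hypothetical placeholders for your own training system:

```python
PERFORMANCE_THRESHOLD = 0.90  # illustrative; set from your cost/quality trade-off

def trigger_retraining_pipeline() -> None:
    # Placeholder: kick off a training job, e.g. by submitting to an orchestrator.
    print("Retraining job submitted")

def maybe_retrain(current_metric: float,
                  threshold: float = PERFORMANCE_THRESHOLD) -> bool:
    """Trigger retraining when evaluated performance falls below the threshold."""
    if current_metric < threshold:
        trigger_retraining_pipeline()
        return True
    return False
```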

Scheduled retraining

| Continuous evaluation | Scheduled retraining |
| --- | --- |
| May happen every day | May occur only every week or every month |
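
Scheduled retraining is typically wired up as a recurring job in a scheduler such as cron or an orchestrator. A naive in-process sketch just to show the cadence; the weekly interval and print statement are illustrative stand-ins for a real training pipeline:

```python
import time
from datetime import datetime, timedelta, timezone

RETRAIN_INTERVAL = timedelta(weeks=1)  # illustrative cadence

def run_scheduled_retraining() -> None:
    """Naive loop-based scheduler; production setups would use cron/Airflow."""
    next_run = datetime.now(timezone.utc)
    while True:  # runs indefinitely, as a scheduler daemon would
        if datetime.now(timezone.utc) >= next_run:
            print("Starting weekly retraining job")  # placeholder for real job
            next_run += RETRAIN_INTERVAL
        time.sleep(3600)  # check once an hour
```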
