DEV Community

Harish Bennalli


Amazon SageMaker Autopilot

What is Amazon SageMaker Autopilot?

Amazon SageMaker Autopilot automatically builds, trains, and tunes the best machine learning models based on your data, while allowing you to maintain full control and visibility.

SageMaker Autopilot removes much of the heavy lifting of building ML models. You provide a tabular dataset and select the target column to predict, which can be a number (regression, such as a house price) or a category (classification, such as spam/not spam). SageMaker Autopilot then automatically explores different solutions to find the best model. You can then deploy the model to production with one click, or iterate on the recommended solutions in Amazon SageMaker Studio to further improve model quality.


Automatic data pre-processing and feature engineering

You can use Amazon SageMaker Autopilot even when you have missing data. SageMaker Autopilot automatically fills in the missing data, provides statistical insights about columns in your dataset, and automatically extracts information from non-numeric columns, such as date and time information from timestamps.
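To make the idea concrete, here is a minimal sketch of the two kinds of transformation described above: filling in missing values and extracting numeric features from a timestamp. It is only an illustration using the Python standard library. It is not Autopilot's actual preprocessing code, and the helper names `impute_median` and `timestamp_features` and the sample values are made up for this example.

```python
from datetime import datetime
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def timestamp_features(ts):
    """Split an ISO-8601 timestamp string into simple numeric features."""
    dt = datetime.fromisoformat(ts)
    return {
        "year": dt.year,
        "month": dt.month,
        "day": dt.day,
        "hour": dt.hour,
        "weekday": dt.weekday(),  # Monday is 0
    }


# Made-up sample data: one missing price and one timestamp column value.
prices = impute_median([100.0, None, 300.0, 200.0])
features = timestamp_features("2021-03-15T14:30:00")
print(prices)
print(features)
```

Autopilot chooses its own imputation and encoding strategies per column; the median fill here is just one common option.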

Automatic ML model selection

Amazon SageMaker Autopilot automatically infers the type of prediction that best suits your data: binary classification, multi-class classification, or regression. SageMaker Autopilot then explores high-performing algorithms such as gradient-boosted decision trees, feedforward deep neural networks, and logistic regression, and trains and optimizes hundreds of models based on these algorithms to find the one that best fits your data.
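As a rough intuition for how a problem type could be inferred from the target column, consider the heuristic below. This is an assumption for illustration only; Autopilot's real inference logic is not documented in this post, and both the function `infer_problem_type` and the `max_classes` threshold are hypothetical. The returned strings match the `ProblemType` values accepted by the SageMaker API.

```python
def infer_problem_type(target_values, max_classes=20):
    """Guess a problem type from the distinct values of the target column."""
    distinct = set(target_values)
    all_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in distinct
    )
    if len(distinct) == 2:
        return "BinaryClassification"
    if all_numeric and len(distinct) > max_classes:
        # Many distinct numeric values suggest a continuous target.
        return "Regression"
    return "MulticlassClassification"


print(infer_problem_type(["spam", "not spam", "spam"]))
print(infer_problem_type([float(i) * 1000 for i in range(100)]))
print(infer_problem_type(["red", "green", "blue"]))
```

If the inferred type is not what you want, the API also lets you set the problem type explicitly when you create the job.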

Model leaderboard

Amazon SageMaker Autopilot allows you to review all the ML models that are automatically generated for your data. You can view the list of models, ranked by metrics such as accuracy, precision, recall, and area under the curve (AUC), review model details such as the impact of features on predictions, and deploy the model that is best suited to your use case.
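The leaderboard can also be reproduced in code. The sketch below ranks candidates by their objective metric, assuming each candidate is a dictionary shaped like the entries returned by the boto3 call `list_candidates_for_auto_ml_job`. The candidate names and metric values are made up, and `rank_candidates` is a hypothetical helper, not part of any AWS SDK.

```python
def rank_candidates(candidates, higher_is_better=True):
    """Sort candidates by objective metric value, best first."""
    return sorted(
        candidates,
        key=lambda c: c["FinalAutoMLJobObjectiveMetric"]["Value"],
        reverse=higher_is_better,
    )


# Made-up candidates in the shape of the API response.
candidates = [
    {"CandidateName": "demo-candidate-a",
     "FinalAutoMLJobObjectiveMetric": {"MetricName": "F1", "Value": 0.81}},
    {"CandidateName": "demo-candidate-b",
     "FinalAutoMLJobObjectiveMetric": {"MetricName": "F1", "Value": 0.93}},
    {"CandidateName": "demo-candidate-c",
     "FinalAutoMLJobObjectiveMetric": {"MetricName": "F1", "Value": 0.88}},
]

leaderboard = rank_candidates(candidates)
for position, candidate in enumerate(leaderboard, start=1):
    metric = candidate["FinalAutoMLJobObjectiveMetric"]
    print(position, candidate["CandidateName"], metric["MetricName"], metric["Value"])
```

For error metrics such as MSE, where lower is better, pass `higher_is_better=False`.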

Automatic notebook creation

You can automatically generate an Amazon SageMaker Studio notebook for any model Amazon SageMaker Autopilot creates. The notebook lets you dive into the details of how the model was created, refine it as desired, and recreate it at any point in the future.

Feature importance

Amazon SageMaker Autopilot provides an explainability report, generated by Amazon SageMaker Clarify, that makes it easier for you to understand and explain how models created with SageMaker Autopilot make predictions. You can also identify how each attribute in your training data contributes to the predicted result as a percentage. The higher the percentage, the more strongly that feature impacts your model’s predictions.
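To show what "contribution as a percentage" can mean, the sketch below normalizes per-feature attribution scores so that they sum to 100. It assumes you already have one attribution score per feature (for example, an aggregated SHAP value). The exact computation behind the report is not described in this post, so treat this normalization as an assumption. The feature names and scores are made up.

```python
def importance_percentages(attributions):
    """Convert raw attribution scores into percentages of the total magnitude."""
    total = sum(abs(score) for score in attributions.values())
    if total == 0:
        return {name: 0.0 for name in attributions}
    return {
        name: 100.0 * abs(score) / total
        for name, score in attributions.items()
    }


# Made-up attribution scores for a house price model.
raw_scores = {"square_feet": 3.0, "bedrooms": 1.5, "building_age": -0.5}
percentages = importance_percentages(raw_scores)
for name, pct in sorted(percentages.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {pct:.1f}%")
```

The sign of a score is dropped here because the percentage describes how strongly a feature influences predictions, not the direction of that influence.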

Easy integration with your applications

You can use the Amazon SageMaker Autopilot application programming interface (API) to easily create models and make inferences right from your applications, such as your data analytics and data warehousing tools.
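As a sketch of what an API call might look like, the function below builds the request for the boto3 SageMaker client's `create_auto_ml_job` operation. The helper `build_autopilot_request`, the job name, the bucket paths, the target column, and the role ARN are all placeholders invented for this example. The actual call is left as a comment because it needs AWS credentials and an existing IAM role.

```python
def build_autopilot_request(job_name, train_s3_uri, output_s3_uri,
                            target_column, role_arn, max_candidates=50):
    """Assemble keyword arguments for create_auto_ml_job."""
    return {
        "AutoMLJobName": job_name,
        "InputDataConfig": [
            {
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                    }
                },
                "TargetAttributeName": target_column,
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "AutoMLJobConfig": {
            "CompletionCriteria": {"MaxCandidates": max_candidates}
        },
        "RoleArn": role_arn,
    }


# Placeholder values; replace them with your own resources.
request = build_autopilot_request(
    job_name="churn-autopilot-demo",
    train_s3_uri="s3://example-bucket/churn/train/",
    output_s3_uri="s3://example-bucket/churn/output/",
    target_column="churned",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
)
print(request["AutoMLJobName"])

# With credentials configured, the job could be started like this:
# import boto3
# boto3.client("sagemaker").create_auto_ml_job(**request)
```

Keeping the request construction separate from the network call makes it easy to inspect or unit test the payload before spending money on training.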

How it works

  1. Load tabular data from Amazon S3 to train the model
  2. Select target column for prediction
  3. Autopilot chooses suitable algorithms, then trains and tunes candidate models automatically
  4. Full visibility and control provided with model notebooks
  5. Select the best model for your needs from a ranked list of recommendations
  6. Deploy and monitor the model

You can also open the generated model notebooks to adjust, optimize, and retrain the models to improve their quality.
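Between starting a job and picking a model, you typically wait for training to finish. The sketch below polls a job until it reaches a terminal status. It assumes a `describe` callable with the same interface as the boto3 SageMaker client's `describe_auto_ml_job`. To keep the example runnable without an AWS account, it uses a fake stand-in, `make_fake_describe`, which is invented for this demo along with the job and candidate names.

```python
import time

TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_job(describe, job_name, poll_seconds=30, sleep=time.sleep):
    """Poll describe() until the job reaches a terminal status."""
    while True:
        response = describe(AutoMLJobName=job_name)
        if response["AutoMLJobStatus"] in TERMINAL_STATUSES:
            return response
        sleep(poll_seconds)


def make_fake_describe(statuses):
    """Return a stand-in for describe_auto_ml_job that replays statuses."""
    remaining = iter(statuses)

    def describe(AutoMLJobName):
        status = next(remaining)
        response = {"AutoMLJobName": AutoMLJobName, "AutoMLJobStatus": status}
        if status == "Completed":
            response["BestCandidate"] = {"CandidateName": "demo-candidate-1"}
        return response

    return describe


# Record the requested sleeps instead of actually waiting.
sleeps = []
result = wait_for_job(
    make_fake_describe(["InProgress", "InProgress", "Completed"]),
    "churn-autopilot-demo",
    sleep=sleeps.append,
)
print(result["AutoMLJobStatus"], result["BestCandidate"]["CandidateName"])
```

With a real client you would pass `boto3.client("sagemaker").describe_auto_ml_job` as `describe` and leave `sleep` at its default.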

Use Cases

Price predictions
Amazon SageMaker Autopilot can predict future prices from your historical data, such as demand, seasonal trends, and the prices of other commodities, to help you make sound investment decisions.

Churn prediction
Customer churn is the loss of customers or clients, and every company looks for ways to reduce it. Models automatically generated by Amazon SageMaker Autopilot help you understand churn patterns. Churn prediction models work by first learning patterns in your existing data and then identifying those patterns in new datasets, so you can get a prediction of which customers are most likely to churn.

Risk assessment
Risk assessment requires identifying and analyzing potential events that may negatively impact individuals, assets, and your company. Models automatically generated by Amazon SageMaker Autopilot predict risks as new events unfold. Risk assessment models are trained using your existing datasets so you can get optimized predictions for your business.

Hope you liked this post!

More on ML updates in the next posts!
