
Using a SageMaker XGBoost model in scikit-learn

Julien Simon. Originally published at Medium.

This is a quick post answering a question I get a lot: “How can I use an XGBoost model that I trained on SageMaker in scikit-learn?”

Here it goes. Once you’ve trained your XGBoost model in SageMaker (examples here), grab the training job name and the location of the model artifact.

I’m using the CLI here, but you can of course use any of the AWS language SDKs.

$ export TRAINING_JOB_NAME='xgboost-190511-0830-010-14f41137'

$ export MODEL_ARTIFACT=`aws sagemaker describe-training-job \
--training-job-name $TRAINING_JOB_NAME \
--query ModelArtifacts.S3ModelArtifacts \
--output text`

$ echo $MODEL_ARTIFACT
s3://sagemaker-eu-west-1-613904931467/sagemaker/DEMO-hpo-xgboost-dm/output/xgboost-190511-0830-010-14f41137/output/model.tar.gz
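If you'd rather stay in Python than use the CLI, the same lookup could be sketched with boto3 along these lines. The helper name `model_artifact_uri` is my own invention for illustration; the only SDK call involved is `describe_training_job`, and the sketch assumes your AWS credentials and region are already configured.

```python
def model_artifact_uri(sagemaker_client, training_job_name):
    """Return the S3 URI of the model artifact for a training job.

    `sagemaker_client` is expected to be a boto3 SageMaker client,
    e.g. boto3.client("sagemaker"). This helper is hypothetical and
    only mirrors the `--query ModelArtifacts.S3ModelArtifacts` CLI call.
    """
    response = sagemaker_client.describe_training_job(
        TrainingJobName=training_job_name
    )
    return response["ModelArtifacts"]["S3ModelArtifacts"]


# Usage (requires boto3 and valid AWS credentials):
#   import boto3
#   uri = model_artifact_uri(
#       boto3.client("sagemaker"), "xgboost-190511-0830-010-14f41137"
#   )
#   print(uri)
```

Passing the client in as an argument keeps the function easy to try out with a stub before pointing it at a real account.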

Then, download the artifact and extract the model.

$ aws s3 cp $MODEL_ARTIFACT .

$ tar xvfz model.tar.gz
x xgboost-model
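The download and extraction steps can also be done from Python. Here's a minimal sketch, assuming a boto3 S3 client and an archive that contains a single `xgboost-model` file as shown above. The function names `download_artifact` and `extract_member` are hypothetical helpers, not part of any SDK.

```python
import shutil
import tarfile
from urllib.parse import urlparse


def download_artifact(s3_client, uri, dest="model.tar.gz"):
    """Download s3://bucket/key to a local file using a boto3 S3 client."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    s3_client.download_file(parsed.netloc, parsed.path.lstrip("/"), dest)
    return dest


def extract_member(archive_path, member="xgboost-model", dest=None):
    """Copy one named file out of a .tar.gz archive and return its path."""
    dest = dest or member
    with tarfile.open(archive_path, "r:gz") as archive:
        source = archive.extractfile(member)  # KeyError if the name is absent
        if source is None:
            raise ValueError(f"{member!r} is not a regular file")
        with source, open(dest, "wb") as target:
            shutil.copyfileobj(source, target)
    return dest


# Usage (requires boto3 and valid AWS credentials):
#   import boto3
#   archive = download_artifact(boto3.client("s3"), uri)
#   extract_member(archive)
```

Pulling out a single named member, rather than extracting the whole archive, avoids writing unexpected paths to disk if the archive ever contains more than you think.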

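The model is a pickled Python object, so let’s now switch to Python and load the model. Unpickling requires the xgboost package to be installed locally, ideally in the same version that was used for training.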

$ python3
>>> import sklearn, pickle
>>> model = pickle.load(open("xgboost-model", "rb"))
>>> type(model)
<class 'xgboost.core.Booster'>

You’re done. From now on, you can use the model as if you’d trained it locally. For example, you can dump it and visualize it.

>>> model.dump_model('model.txt')
>>> exit()

$ head model.txt
booster[0]:
0:[f2<512] yes=1,no=2,missing=1
 1:[f1<3.5] yes=3,no=4,missing=3
  3:[f2<1.5] yes=7,no=8,missing=7
   7:[f42<0.5] yes=15,no=16,missing=15
    15:leaf=0.508301735
    16:leaf=1.51004589
   8:leaf=1.72906268
  4:[f52<0.5] yes=9,no=10,missing=9
   9:leaf=1.39554036
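Once you have the text dump, it's easy to poke at it with a few lines of Python. As a rough sketch, assuming the dump keeps the line format shown above (split lines like `0:[f2<512] yes=1,no=2,missing=1` and leaf lines like `15:leaf=0.508301735`), you could count how often each feature is used in a split. The `summarize_dump` function is a hypothetical helper of mine, not an XGBoost API.

```python
import re
from collections import Counter

# Split node, e.g. "1:[f1<3.5] yes=3,no=4,missing=3"
SPLIT_RE = re.compile(r"^\s*\d+:\[([^<\]]+)<[^\]]+\]")
# Leaf node, e.g. "15:leaf=0.508301735"
LEAF_RE = re.compile(r"^\s*\d+:leaf=")


def summarize_dump(lines):
    """Return (split counts per feature, number of leaves) for dump lines."""
    feature_counts = Counter()
    leaf_count = 0
    for line in lines:
        split = SPLIT_RE.match(line)
        if split:
            feature_counts[split.group(1)] += 1
        elif LEAF_RE.match(line):
            leaf_count += 1
        # Anything else (e.g. "booster[0]:" headers) is ignored.
    return feature_counts, leaf_count


# Usage:
#   with open("model.txt") as f:
#       counts, leaves = summarize_dump(f)
#   print(counts.most_common(10))
```

This is only a quick way to eyeball which features the trees rely on; XGBoost has its own importance metrics if you need something more rigorous.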

See? That was super easy :)

Thanks for reading. Happy to answer questions here or on Twitter.


Julien Simon (@juliensimon), Global Evangelist, AI & Machine Learning, Amazon Web Services
