Preface
Amazon SageMaker is a great tool for us data scientists and machine learning engineers to explore data and build models. Its range of preinstalled packages is enough for typical scenarios, but when you need new or specific versions of packages, the pip or conda package managers can be used.
What will this blog give you?
A template you can bend and flex to your requirements, so you can get lifecycle configurations done fast.
The Problem
The one issue is that every time I turn the notebook instance off, the libraries I installed manually are lost. Reinstalling them becomes a routine every time I start the instance. I would much rather have my environment and libraries set up automatically when the instance starts.
Preferences
- I like to start a new environment with a fresh Python installation using conda. Here's a one-liner for that:
conda create -n <env_name> python=<python version>
conda create -n dev_env python=3.9
- Then we can activate the environment with:
conda activate <env_name>
conda activate dev_env
- Once inside the environment, I like to install packages using pip.
pip install <package name>
pip install pyarrow
Let's put this in a lifecycle configuration shell script, shall we!
The following script takes care of these tasks:
- Creating an environment with the desired version of Python. In this case, Python 3.9.
- Installing the required libraries with pip. In this case, pandas, pyarrow and TensorFlow.
- Making the newly created environment accessible to notebooks. Please read the comments in the script.
#!/bin/bash
set -e

# use ec2-user for operations
sudo -u ec2-user -i <<'EOF'
# environment creation
conda create -n dev_env python=3.9 -y
source activate dev_env
# library installation
pip install pandas
pip install pyarrow
pip install tensorflow==2.9
# make new environment accessible
conda install ipykernel -y
source deactivate
EOF
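If you want to reuse the script above as a template for other environments, one option is to generate it from a few parameters. This is only a sketch: `render_on_start` is a hypothetical helper name of my own, and it simply prints the same steps with the environment name, Python version and package list filled in.

```shell
#!/bin/bash
# render_on_start ENV_NAME PYTHON_VERSION PACKAGE...
# Hypothetical helper: prints an on-start script for the given environment
# to stdout. Nothing is installed by calling this function.
render_on_start() {
  local env_name="$1" python_version="$2"
  shift 2
  # The outer heredoc (SCRIPT) is unquoted, so the variables expand here.
  # The inner <<'EOF' is emitted as plain text for the generated script.
  cat <<SCRIPT
#!/bin/bash
set -e

sudo -u ec2-user -i <<'EOF'
conda create -n ${env_name} python=${python_version} -y
source activate ${env_name}
pip install $*
conda install ipykernel -y
source deactivate
EOF
SCRIPT
}

# Example:
#   render_on_start dev_env 3.9 pandas pyarrow "tensorflow==2.9" > on-start.sh
```

The generated file can then be pasted into the lifecycle configuration editor like any hand-written script.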
Script placement
1. Locate lifecycle configurations in console
2. Create lifecycle configuration to run on notebook start
3. Add lifecycle configuration to notebook instance configurations
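If you prefer the command line over the console, the same steps can probably be done with the AWS CLI. The sketch below assumes the AWS CLI is installed and configured with SageMaker permissions. The function names, `my-config`, `my-notebook` and `on-start.sh` are placeholders of mine. As far as I know, the notebook instance has to be stopped before its lifecycle configuration can be changed.

```shell
#!/bin/bash
# create_lifecycle_config CONFIG_NAME SCRIPT_PATH
# Hypothetical wrapper: registers SCRIPT_PATH as the on-start script of a new
# lifecycle configuration. The API expects the script base64-encoded.
create_lifecycle_config() {
  local config_name="$1" script_path="$2"
  local content
  # Strip newlines so the encoded script is a single-line string.
  content="$(base64 < "$script_path" | tr -d '\n')"
  aws sagemaker create-notebook-instance-lifecycle-config \
    --notebook-instance-lifecycle-config-name "$config_name" \
    --on-start "Content=${content}"
}

# attach_lifecycle_config INSTANCE_NAME CONFIG_NAME
# Hypothetical wrapper: attaches an existing lifecycle configuration
# to a notebook instance.
attach_lifecycle_config() {
  local instance_name="$1" config_name="$2"
  aws sagemaker update-notebook-instance \
    --notebook-instance-name "$instance_name" \
    --lifecycle-config-name "$config_name"
}

# Example (placeholder names):
#   create_lifecycle_config my-config on-start.sh
#   attach_lifecycle_config my-notebook my-config
```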
Results
Kernel Selection!
See the kernel listed in the dropdown
Couple of checks on terminal
And we have what we need!
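For the terminal checks, something along these lines should work. It is a sketch, not the exact commands I ran: `env_exists` and `check_package` are hypothetical helper names, and it assumes a conda version that supports `conda run`.

```shell
#!/bin/bash
# env_exists ENV_NAME
# Succeeds if ENV_NAME appears in the first column of `conda env list`.
env_exists() {
  conda env list | awk '!/^#/ && NF {print $1}' | grep -qxF -- "$1"
}

# check_package ENV_NAME MODULE
# Imports MODULE inside ENV_NAME and prints its version, without
# activating the environment in the current shell.
check_package() {
  conda run -n "$1" python -c "import $2; print($2.__version__)"
}

# Run the checks only where conda is available.
if command -v conda >/dev/null 2>&1; then
  if env_exists dev_env; then
    check_package dev_env pyarrow
  else
    echo "dev_env not found"
  fi
fi
```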
There we have it! Let me know if it helped anyone. If anyone knows a better method, please comment and let me know! This is only one example of the endless applications of lifecycle configurations. You can access more of them here: SageMaker Lifecycle Configuration scripts