AWS MENA Community

Getting started with Julia on AWS SageMaker | AWS White Paper Summary

Salah Elhossiny
ML engineer || AWS Certified MLS || AWS Community Builders member || Fullstack developer

In this article, we explore how to get started with Julia on Amazon SageMaker, using Julia as an alternative language for data science and machine learning.

  • This guide will walk you through the following steps:

    • Creating an Amazon SageMaker notebook instance
    • Creating a Julia environment
    • Installing IJulia and running a Julia notebook
    • Testing the Julia notebook
  • This guide assumes that you have the following:

    • An AWS account with the capability to create new instances in the Amazon SageMaker console.
    • A basic understanding of machine learning or data processing.

Setting Up Your Environment

Creating an Amazon SageMaker Notebook Instance

  1. Sign in to the Amazon SageMaker console, and open the Amazon SageMaker dashboard.
  2. Select Notebook instances. Alternatively, in the left menu panel, select Notebook instances in the Notebook section.
  3. Select Create notebook instance.
  4. For Notebook instance name, enter a name, and for Notebook instance type, select an instance type that fits your workload and budget.
  5. Select Open JupyterLab to launch the JupyterLab console of your notebook instance.
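
The console steps above also have a command-line counterpart. The following is a minimal sketch, not part of the white paper, assuming the AWS CLI is installed and configured with permission to create notebook instances. The function name create_notebook_instance and the AWS_CMD override are hypothetical, and the instance name, instance type, and role ARN in the usage note are placeholders.

```shell
#!/bin/sh
# Sketch: create a notebook instance and wait until it is in service.
# AWS_CMD is a hypothetical override; set AWS_CMD="echo aws" to preview
# the commands without calling AWS.
create_notebook_instance() {
    name="$1"            # notebook instance name
    instance_type="$2"   # instance type, for example ml.t3.medium
    role_arn="$3"        # IAM role that SageMaker assumes for the instance
    aws_cmd="${AWS_CMD:-aws}"
    $aws_cmd sagemaker create-notebook-instance \
        --notebook-instance-name "$name" \
        --instance-type "$instance_type" \
        --role-arn "$role_arn" || return 1
    # Block until the instance status reaches InService.
    $aws_cmd sagemaker wait notebook-instance-in-service \
        --notebook-instance-name "$name"
}
```

For example, create_notebook_instance julia-demo ml.t3.medium arn:aws:iam::123456789012:role/ExampleSageMakerRole would request an instance named julia-demo; the role ARN shown here is made up and must be replaced with a role from your own account.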


Creating a Julia Environment

Use the following procedure to create a Julia environment in the JupyterLab console.

  1. In the JupyterLab console, navigate to the Launcher tab if it is visible, scroll to the bottom of the tab, and select Terminal. If the Launcher tab is not visible, go to the menu bar and select File, New, Terminal.
  2. A new tab opens with a Linux shell console attached.

  3. To create a new conda environment named julia and switch into it, run the following commands.

#!/bin/sh
source activate
conda create --yes -n julia
conda activate julia

  4. To verify your current environment, run the following command.

#!/bin/sh
conda env list

In the output, the environment named julia is marked with an asterisk (*) to indicate that it is currently active.

  5. To install the Julia language package and verify that the installation is complete, run the following commands.

#!/bin/sh
conda install --yes -c conda-forge julia=1.0.3
julia -v

  6. To confirm that Julia runs, launch a Julia REPL console with the following command.

#!/bin/sh
julia

You have now successfully set up the Julia REPL and have the full capabilities and features of a Julia runtime environment.
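
The two verification steps in this procedure can also be scripted, which may help when the setup is repeated on a new instance. The sketch below is one possible approach and is not taken from the white paper. It assumes that conda exports the active environment name in the CONDA_DEFAULT_ENV variable and that julia -v prints a banner of the form "julia version X.Y.Z". The helper names require_env and check_julia_version are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named conda environment is active.
# Assumes conda exports the active environment name in CONDA_DEFAULT_ENV.
require_env() {
    expected="$1"
    if [ "${CONDA_DEFAULT_ENV:-}" = "$expected" ]; then
        echo "active environment: $expected"
    else
        echo "expected '$expected', but found '${CONDA_DEFAULT_ENV:-none}'" >&2
        return 1
    fi
}

# Hypothetical helper: compare a `julia -v` banner with the pinned version.
# Assumes the banner has the form "julia version X.Y.Z".
check_julia_version() {
    banner="$1"             # for example, the output of: julia -v
    wanted="$2"             # for example, 1.0.3
    found="${banner##* }"   # keep only the text after the last space
    if [ "$found" = "$wanted" ]; then
        echo "ok: Julia $found"
    else
        echo "mismatch: wanted $wanted, found $found" >&2
        return 1
    fi
}
```

In the terminal, after leaving the REPL with exit(), the helpers could be called as require_env julia and check_julia_version "$(julia -v)" 1.0.3. Each returns a non-zero status on a mismatch, so both can be chained with && in a setup script.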


Installing IJulia and running a Julia Notebook

Use the following procedure to install the Julia kernel (IJulia package) for Jupyter. This will make JupyterLab aware of Julia’s existence, and configure the necessary settings, such as new options for the Julia notebook and kernel.

  1. To install IJulia and activate the Julia kernel for JupyterLab, run the following commands in the Julia REPL console.

using Pkg
Pkg.add("IJulia")
using IJulia
jupyterlab(detached=true)

  2. To launch your first Julia notebook in Amazon SageMaker, choose Julia 1.0.3 on the Launcher tab.
  3. To test native Julia code, enter the following snippets into code cells in the newly created Julia notebook.

# Print the Julia version and platform details
versioninfo()
# Define a one-line function; type \sum and press Tab to enter ∑
∑(x, y) = x + y
# Call the function; returns 5
∑(2, 3)

Julia Notebook Examples

Use the following procedure to experiment further with how Julia enables you to visualize data.

  1. To install the Plots and DataFrames packages, enter the following into a Julia notebook cell.

import Pkg
Pkg.add("Plots")
Pkg.add("DataFrames")

Pkg, Julia's built-in package manager, installs Plots and DataFrames and automatically resolves and installs any dependencies they require.

  2. To generate a sequence of integers and a sequence of random float values, run the following commands.

using Random
A = collect(1:1000)                # integers 1 through 1000
B = [randn(Float64) for _ in A]    # one standard-normal sample per element of A
last(A)                            # returns 1000

  3. To create a DataFrame object from the two arrays, run the following commands.

using DataFrames
df = DataFrame(X = A, Y = B)
sort!(df, [:X])
last(df, 5)

A DataFrame object (df) is created, where column X contains the values from array A and column Y contains the values from array B. Then df is sorted in place by the values in column X, and its last five rows are displayed.

  4. To plot the data as a scatter plot and then as a weighted histogram, run the following commands, each plot in its own cell.

using Plots

scatter(df.X, df.Y, w=3)
histogram(df.Y, bins=:scott, weights=repeat(1:5, outer=200))

Deleting Your Instance

When your notebook instance is no longer in use, we recommend deleting the instance. Use the following procedure to stop and delete your notebook instance.

  1. Navigate to your Notebook instances list, and choose the instance that you want to delete.
  2. To stop your instance, open the Actions menu and select Stop. Stopping an instance may take several minutes. When it completes, the Status column in your Notebook instances list shows Stopped.
  3. Select the stopped instance, open the Actions menu, and select Delete.
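
The same stop-then-delete sequence can be sketched with the AWS CLI. This is an illustrative alternative to the console steps, not part of the white paper, and it assumes the AWS CLI is installed and configured for your account. The function name delete_notebook_instance and the AWS_CMD override are hypothetical.

```shell
#!/bin/sh
# Sketch: stop a notebook instance, wait for it to stop, then delete it.
# AWS_CMD is a hypothetical override; set AWS_CMD="echo aws" to preview
# the commands without calling AWS.
delete_notebook_instance() {
    name="$1"
    aws_cmd="${AWS_CMD:-aws}"
    # An instance must be stopped before it can be deleted.
    $aws_cmd sagemaker stop-notebook-instance \
        --notebook-instance-name "$name" || return 1
    # Block until the instance status reaches Stopped.
    $aws_cmd sagemaker wait notebook-instance-stopped \
        --notebook-instance-name "$name" || return 1
    $aws_cmd sagemaker delete-notebook-instance \
        --notebook-instance-name "$name"
}
```

Running it first with AWS_CMD="echo aws" prints the three commands without touching the instance, which is a cheap way to double-check the instance name before deleting anything.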

References

Original White Paper
