Patrick Titzler for IBM Developer

Elyra 2.2: R support, updated CLI, and more

It has been only a couple of weeks since the Elyra open source community published version 2.1, which introduced experimental support for Apache Airflow.

Version 2.2 delivers several enhancements that our growing user base had on its wishlist. In this blog post I'll summarize the highlights:

  • New R script editor
  • Improved pipeline editor with support for R scripts
  • Deployment on Kubeflow Notebook servers
  • Extended command line interface with support for pipeline execution

As always, you can find the complete list of features and bug fixes that made it into the release in the changelog.

Edit and run R scripts

The Elyra Script Editor was extended to support the R language. You can now create, edit, and run R scripts in JupyterLab, in addition to Python scripts.

Script editors for Python and R in the Launcher

By installing the optional Language Server Protocol support for R, you can take advantage of productivity features you are likely familiar with from other IDEs, such as code linting and auto-completion.

R script in editor shown with Language Server Protocol enabled

We realize that this editor can't compete with RStudio, but it's a start!

If you've used Jupyter notebooks in an enterprise deployment, you are probably familiar with the Jupyter Enterprise Gateway (JEG). In fact, even if you are not familiar with it, you might have used it. In a nutshell, it lets you run notebooks in remote kernels, allowing for better resource allocation and usage.

Watson Studio is one example of a managed enterprise service that leverages the JEG.

Because it extends JupyterLab functionality, Elyra can take advantage of the Enterprise Gateway as well. One little-known feature of Elyra, though, is its ability to run code remotely without JEG, through its support for Kubeflow Pipelines and Apache Airflow.

If you have access to a Kubeflow Pipelines or Apache Airflow deployment, you can run R scripts (just like Python scripts and Jupyter notebooks) in those deployments directly from the editor. This is especially useful for scripts that require resources that are not available (or not sufficiently available) in your local environment.

Submit an R script for processing in Kubeflow Pipelines

Run R scripts in pipelines

In the Visual Pipeline Editor you can now assemble pipelines from multiple R scripts, or mix R scripts with Jupyter notebooks and Python scripts, as necessary.

A pipeline comprising a notebook, a Python script, and an R script

You can run these pipelines locally in JupyterLab or remotely on Kubeflow Pipelines or Apache Airflow.

Processed pipeline in Kubeflow Pipelines GUI
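To make the idea of a mixed pipeline a bit more concrete, here is a minimal Python sketch that sorts a list of files into the three kinds of nodes mentioned above. It is purely illustrative: the `NODE_KINDS` mapping and the `node_kind` and `summarize` helpers are hypothetical names made up for this example and are not part of Elyra.

```python
from pathlib import PurePath

# Illustrative mapping from file extension to the kinds of pipeline nodes
# described in this post. This is an assumption for the example, not
# Elyra's internal logic.
NODE_KINDS = {
    ".ipynb": "notebook",
    ".py": "python-script",
    ".r": "r-script",
}


def node_kind(filename):
    """Return the node kind for a file, or raise ValueError if unsupported."""
    suffix = PurePath(filename).suffix.lower()
    if suffix not in NODE_KINDS:
        raise ValueError(f"unsupported pipeline node file: {filename}")
    return NODE_KINDS[suffix]


def summarize(filenames):
    """Count how many files of each kind a pipeline would contain."""
    counts = {}
    for name in filenames:
        kind = node_kind(name)
        counts[kind] = counts.get(kind, 0) + 1
    return counts
```

For instance, `summarize(["load.ipynb", "clean.py", "plot.R"])` reports one node of each kind, which corresponds to the pipeline shown in the screenshot above.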

If you are new to Elyra pipelines, take a look at the tutorials. They guide you through the process of creating and running a pipeline in various environments.

Use Elyra to run Kubeflow-hosted notebooks

Elyra can be deployed locally or in remote environments.

A local deployment typically serves only a single user and is created by installing Elyra from PyPI, conda, or source code, or by pulling a ready-to-use container image.

Remote deployments, such as in a data center or the cloud, are typically used when support for many users is required.

A common approach is to deploy JupyterHub on Kubernetes and configure it for Elyra, as is done in Open Data Hub on the Red Hat OpenShift Container Platform.

If you already have Kubeflow deployed and don't want to provision a dedicated instance of JupyterHub to serve notebooks, we've got great news for you. We've recently started to publish custom Elyra container images on Docker Hub and quay.io that you can use to run JupyterLab with Elyra on Kubeflow Notebook Servers. All you need to do is specify the Elyra container image name and (version) tag when you configure a new notebook server and you are good to go!

Configuring an Elyra notebook server in Kubeflow

Extended command line interface

As an extension to JupyterLab, Elyra is primarily GUI driven. However, certain tasks can also be completed using the elyra-metadata command line interface.

The Elyra command line interface was extended in version 2.2 to support running pipelines in local and remote environments. Initially this capability is only exposed through the elyra-pipeline command line interface, but work is under way to provide a unified interface.

Run pipelines locally

Specify the run command to execute the pipeline locally, passing the pipeline file name as a parameter, like so:

$ elyra-pipeline run /path/to/hello-world.pipeline

This feature is still under active development, for example to add visualization of execution progress.
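If you want to trigger a local run from a script, for example as part of a test job, the command can be wrapped in a few lines of Python. The following is a minimal sketch, assuming that elyra-pipeline is on your PATH and accepts the `run` syntax shown above; `build_run_command` and `run_locally` are hypothetical helper names, not part of Elyra.

```python
import subprocess
from pathlib import Path


def build_run_command(pipeline_file):
    """Build the argument list for `elyra-pipeline run <file>`."""
    if Path(pipeline_file).suffix != ".pipeline":
        raise ValueError(f"expected a .pipeline file, got: {pipeline_file}")
    return ["elyra-pipeline", "run", pipeline_file]


def run_locally(pipeline_file):
    """Run the pipeline in the local environment and return the exit code."""
    if not Path(pipeline_file).is_file():
        raise FileNotFoundError(pipeline_file)
    # check=False: let the caller decide how to react to a failed run.
    completed = subprocess.run(build_run_command(pipeline_file), check=False)
    return completed.returncode
```

Calling `run_locally("/path/to/hello-world.pipeline")` would then return the exit code of the command, which a calling script can use to decide whether the run succeeded.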

Run pipelines remotely

Specify the submit command to run the pipeline on Kubeflow Pipelines or Apache Airflow, passing the pipeline file name and the runtime configuration name as parameters, like so:

$ elyra-metadata list runtimes
 Schema    Instance      Resource
 ------    --------      --------
 kfp       kfp_test_env  /.../runtimes/kfp_test_env.json

$ elyra-pipeline submit --runtime-config kfp_test_env /path/to/hello-world.pipeline
  ...

If the pipeline was successfully submitted for execution, the command returns a GUI link that you can use to monitor the progress and a link to the cloud storage where the pipeline run artifacts are stored.
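The two commands above can also be combined in a small script that first checks whether the runtime configuration exists and only then submits the pipeline. This is a rough sketch, assuming the three-column listing format and the `--runtime-config` option shown above for version 2.2 (other releases may differ). The helper names `parse_runtime_names`, `build_submit_command`, and `submit` are hypothetical.

```python
import subprocess


def parse_runtime_names(listing):
    """Extract instance names from `elyra-metadata list runtimes` output.

    Assumes the layout shown above: a dashed separator row followed by
    rows with the columns Schema, Instance, and Resource.
    """
    names = []
    in_table = False
    for line in listing.splitlines():
        columns = line.split()
        if not columns:
            continue
        if set(columns[0]) == {"-"}:
            in_table = True  # dashed separator: data rows follow
            continue
        if in_table and len(columns) >= 3:
            names.append(columns[1])
    return names


def build_submit_command(pipeline_file, runtime_config):
    """Build the argument list for `elyra-pipeline submit`."""
    return ["elyra-pipeline", "submit",
            "--runtime-config", runtime_config, pipeline_file]


def submit(pipeline_file, runtime_config):
    """Submit the pipeline if the runtime configuration is defined."""
    listing = subprocess.run(
        ["elyra-metadata", "list", "runtimes"],
        capture_output=True, text=True, check=True,
    ).stdout
    if runtime_config not in parse_runtime_names(listing):
        raise ValueError(f"unknown runtime configuration: {runtime_config}")
    command = build_submit_command(pipeline_file, runtime_config)
    return subprocess.run(command, check=False).returncode
```

Checking the runtime configuration name up front is a design choice made for this sketch: it produces a clear error message before anything is sent to Kubeflow Pipelines or Apache Airflow.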

Improved usability

If you've used previous releases of Elyra, you should notice quite a few usability improvements that we've made. There's no denying that the Elyra project has matured a lot since it was started about a year ago.

Coming up next

We've just started work on our next releases. There's plenty of stuff brewing in our lab. If you'd like to get the inside scoop, check out our discussion forum, chat with us, or join the weekly community meeting.
