Pranav Swaroop Gundla

Deploying RStudio Server on HPC Clusters

This guide provides a streamlined approach for deploying RStudio Server on HPC environments with resource-rich configurations, enabling researchers to leverage computational power for data-intensive R workflows.

Overview

RStudio Server is deployed from a container image, which you can obtain with either Docker or Apptainer.
The rocker project publishes the r-base image; choose the version tag that best suits your use case.
For Docker, pull the image with:

docker pull rocker/r-base

For Apptainer (formerly Singularity), you can use the following definition file to build an image that contains both base R and RStudio Server.

Bootstrap: docker
From: rocker/rstudio:4.0.0

%post
    apt-get update && apt-get install -y \
        sudo \
        locales \
        pwgen \
        gdebi-core \
        && apt-get clean

    # Set locale
    echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen
    locale-gen en_US.UTF-8
    export LANG=en_US.UTF-8
    export LC_ALL=en_US.UTF-8

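    # Create the rstudio user only if the base image does not already provide it
    # (an unguarded useradd aborts the build when the user exists)
    id -u rstudio >/dev/null 2>&1 || useradd -m rstudio
    echo "rstudio:rstudio" | chpasswd
    adduser rstudio sudo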

    # Add default R profile if needed
    echo "options(repos = c(CRAN = 'https://cloud.r-project.org'))" >> /home/rstudio/.Rprofile

%environment
    export USER=rstudio
    export PASSWORD=rstudio
    export LANG=en_US.UTF-8
    export LC_ALL=en_US.UTF-8

%runscript
    echo "Starting RStudio Server..."
    exec /init

%labels
    Maintainer your_username
    Version 4.0.0

%startscript
    exec /init

Build It on HPC (Interactive Node)

singularity build rocker_rstudio.sif rocker_rstudio.def

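Run the build on a node that has network access and Singularity (or Apptainer) installed. Building from a definition file usually requires root privileges or the --fakeroot option, depending on how your site is configured. The job script in the next section expects the image at $PROJECT_DIR/r_latest.sif, so copy or rename rocker_rstudio.sif accordingly, or change the path in the script.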
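
Because the tool may be installed as either `apptainer` or `singularity` depending on the cluster, a small wrapper can pick whichever is available before building. This is only a sketch: the function names `pick_container_cmd` and `build_image` are hypothetical helpers, not part of either tool.

```shell
#!/bin/sh
# Print the name of the first container runtime found on PATH
# (apptainer is tried first). Returns non-zero if neither is installed.
pick_container_cmd() {
    for cmd in apptainer singularity; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            return 0
        fi
    done
    echo "Neither apptainer nor singularity found on PATH" >&2
    return 1
}

# Build an image with whichever runtime exists.
# Arguments: output .sif path, input .def path.
build_image() {
    runtime=$(pick_container_cmd) || return 1
    "$runtime" build "$1" "$2"
}
```

With these defined, `build_image rocker_rstudio.sif rocker_rstudio.def` runs the same build as the command above.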
The SLURM script below starts RStudio Server inside the Singularity container and prints the SSH tunnel command you need to reach it securely from your workstation.

SLURM Job Script

#!/bin/sh
#SBATCH --time=08:00:00
#SBATCH --partition=partition_name
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=49152
#SBATCH --output=/home/%u/rstudio-server.job.%j

# Set your project directory containing the Singularity container
export PROJECT_DIR=/path/to/your/project/containers

echo "Current hostname: ${HOSTNAME}"

# Dynamically allocate an unused port to avoid conflicts
# Reference: https://unix.stackexchange.com/a/132524
readonly PORT=$(python3 -c 'import socket; s=socket.socket(); s.bind(("", 0)); print(s.getsockname()[1]); s.close()')

# Display connection instructions
cat 1>&2 <<END
=================================================================
RStudio Server Setup Complete
=================================================================

1. SSH tunnel from your local workstation:
   ssh -N -L 8787:${HOSTNAME}:${PORT} your_username@your_hpc_login_node

2. Open your web browser and navigate to:
   http://localhost:8787

3. When finished, terminate the job:
   - Exit RStudio Session (power button in top-right corner)
   - Run: scancel -f ${SLURM_JOB_ID}

=================================================================
END

# Launch RStudio Server with Singularity.
# Note: --auth-none 1 disables the RStudio login page, so anyone who can
# reach this port on the compute node can open the session.
singularity exec \
    --bind "$PROJECT_DIR/etc:/etc/rstudio" \
    --bind "$PROJECT_DIR/tmpfs/lib:/var/lib/rstudio-server" \
    --bind "$PROJECT_DIR/tmpfs/run:/var/run/rstudio-server" \
    --bind /etc/resolv.conf:/etc/resolv.conf \
    --bind /etc/ssl/certs:/etc/ssl/certs \
    "$PROJECT_DIR/r_latest.sif" \
    rserver --www-port "${PORT}" --auth-none 1 --auth-pam-helper rstudio_auth

printf 'RStudio Server session terminated\n' 1>&2
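
The port-selection step in the script can be pulled out into a small function, which makes it easy to validate the result before launching the server. This is a sketch under assumptions: `find_free_port` and `is_valid_port` are hypothetical helper names, and `python3` is assumed to be available on the compute node, as the script already requires. Note that the port is released before `rserver` binds to it, so another process could in principle claim it in between.

```shell
#!/bin/sh
# Ask the kernel for an unused TCP port by binding to port 0,
# then print the port number that was assigned.
find_free_port() {
    python3 -c 'import socket
s = socket.socket()
s.bind(("", 0))
print(s.getsockname()[1])
s.close()'
}

# Succeed only if the argument is an integer in the valid TCP port range.
is_valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}
```

A job script could then run `PORT=$(find_free_port)` and stop early if `is_valid_port "$PORT"` fails, instead of starting the server with an empty port.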

Setup Requirements

1. Directory Structure

Ensure your project directory contains the following structure:

/path/to/your/project/containers/
├── r_latest.sif          # R Singularity image
├── etc/                  # RStudio configuration files
├── tmpfs/
│   ├── lib/             # RStudio server library files
│   └── run/             # Runtime files
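
The bind-mount directories in this layout have to exist before the job starts, otherwise the `--bind` options fail. A minimal sketch for creating them is shown here; `prepare_rstudio_dirs` is a hypothetical helper name, and the image file itself still has to be copied in separately.

```shell
#!/bin/sh
# Create the bind-mount directories the job script expects.
# Argument: the project directory (PROJECT_DIR in the job script).
prepare_rstudio_dirs() {
    project_dir=$1
    mkdir -p "$project_dir/etc" \
             "$project_dir/tmpfs/lib" \
             "$project_dir/tmpfs/run"
}
```

For example, `prepare_rstudio_dirs /path/to/your/project/containers` sets up the tree above without touching directories that already exist.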

2. Singularity Container

  • Use an R Singularity image (.sif file) with RStudio Server pre-installed
  • Ensure the image includes your required R packages and dependencies

Usage Instructions

Deployment

  1. Submit the job:
   sbatch rstudio-server.sh
  2. Monitor job status:
   squeue -u $USER
  3. Check output for connection details:
   cat ~/rstudio-server.job.[JOB_ID]
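
Rather than reading the whole output file, you can pull out just the tunnel command that the job script printed. The following is a sketch; `extract_tunnel_cmd` is a hypothetical helper name, and it assumes the output file contains the `ssh -N -L` line in the form the script writes it.

```shell
#!/bin/sh
# Print the first SSH tunnel command found in a job output file,
# with leading whitespace removed. Returns non-zero if none is found.
extract_tunnel_cmd() {
    line=$(grep -m 1 'ssh -N -L' "$1") || return 1
    echo "$line" | sed 's/^[[:space:]]*//'
}
```

If you submit with `sbatch --parsable rstudio-server.sh`, which prints only the job ID, you can store that ID in a variable and pass `~/rstudio-server.job.$job_id` to the function once the job is running.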

Access

  1. Create SSH tunnel (replace placeholders with your values):
   ssh -N -L 8787:compute_node:assigned_port username@hpc_login_node
  2. Open browser and navigate to http://localhost:8787
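
If port 8787 is already in use on your workstation, the local side of the tunnel can be any free port; only the compute node and remote port must match the job output. A small sketch for composing the command follows; `tunnel_cmd` is a hypothetical helper, and the host names in the usage note are placeholders.

```shell
#!/bin/sh
# Print an SSH tunnel command.
# Arguments: local port, compute node, remote port, user@login_node.
tunnel_cmd() {
    printf 'ssh -N -L %s:%s:%s %s\n' "$1" "$2" "$3" "$4"
}
```

Calling `tunnel_cmd 8888 compute_node assigned_port username@hpc_login_node` prints a command that forwards local port 8888, in which case you would browse to http://localhost:8888 instead.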

Cleanup

  • Exit RStudio session properly using the power button
  • Cancel the SLURM job: scancel -f [JOB_ID]

Resource Configuration

The script allocates:

  • Time limit: 8 hours
  • CPUs: 8 cores
  • Memory: 48 GB (49152 MB)
  • Tasks: 1

Adjust these parameters based on your computational requirements and cluster policies.
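
The `--mem` value is given in megabytes when no unit suffix is used, which is where 49152 comes from (48 × 1024). A tiny sketch of that conversion, with `gb_to_mb` as a hypothetical helper name:

```shell
#!/bin/sh
# Convert a whole number of gigabytes to the megabyte value used by --mem.
gb_to_mb() {
    echo $(( $1 * 1024 ))
}
```

Options given on the `sbatch` command line override the `#SBATCH` directives in the script, so a smaller session could be requested with something like `sbatch --time=04:00:00 --cpus-per-task=4 --mem="$(gb_to_mb 16)" rstudio-server.sh` without editing the file.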

Benefits

  • Scalable: Leverage HPC computational resources for R workflows
  • Secure: SSH tunneling encrypts the connection between your workstation and the cluster
  • Isolated: Containerized environment prevents conflicts
  • Flexible: Dynamic port allocation avoids port conflicts with other services on the node

Troubleshooting

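  • Port conflicts: The script selects an unused port at start-up; if rserver still fails to bind, resubmit the job to pick a new one
  • Container issues: Verify Singularity image path and permissions
  • SSH tunnel: Confirm that the login node can reach the compute node on the assigned port and that local port 8787 is free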
  • Job limits: Check cluster policies for resource constraints
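
Several of these problems can be caught before submitting the job. The sketch below checks the paths the job script relies on; `preflight_check` is a hypothetical helper, and it assumes the image is named `r_latest.sif` as in the script.

```shell
#!/bin/sh
# Verify that the image and bind-mount directories named in the job script
# exist under the given project directory. Problems are reported on stderr;
# the function returns non-zero if anything is missing.
preflight_check() {
    project_dir=$1
    status=0
    if [ ! -r "$project_dir/r_latest.sif" ]; then
        echo "Image missing or unreadable: $project_dir/r_latest.sif" >&2
        status=1
    fi
    for dir in etc tmpfs/lib tmpfs/run; do
        if [ ! -d "$project_dir/$dir" ]; then
            echo "Directory missing: $project_dir/$dir" >&2
            status=1
        fi
    done
    return "$status"
}
```

Running `preflight_check /path/to/your/project/containers && sbatch rstudio-server.sh` submits the job only when the layout looks complete.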

Adapted from the rstudio-slurm project
