If you are new to High Performance Computing, one of the first things you will do is submit a job using Slurm.
It can feel confusing at first, but once you understand the basics it becomes straightforward.
Let’s walk through how to write your first Slurm job script.
What Is a Slurm Job Script?
A Slurm job script is just a simple shell script that tells the scheduler:
- What resources you need
- How long your job will run
- What command should be executed
Instead of running your program directly, you submit this script to Slurm, which queues the job and runs it when the requested resources become available.
Basic Structure of a Job Script
A typical Slurm script looks like this:
#!/bin/bash
#SBATCH --job-name=test_job
#SBATCH --output=output.log
#SBATCH --error=error.log
#SBATCH --time=00:10:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
echo "Hello from Slurm!"
hostname
Let’s break this down.
Understanding the SBATCH Directives
Lines starting with #SBATCH are instructions for Slurm. The shell treats them as comments, but the scheduler reads them as resource requests.
Job Name
#SBATCH --job-name=test_job
This is just a label that identifies your job in queue listings.
Output and Error Files
#SBATCH --output=output.log
#SBATCH --error=error.log
- output.log → stores normal output
- error.log → stores errors
Very useful for debugging.
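If you run many jobs, fixed filenames like output.log get overwritten each time. Slurm supports filename patterns such as %j (job ID) and %x (job name), so each job writes to its own files:

```shell
#SBATCH --output=%x_%j.out   # e.g. test_job_12345.out
#SBATCH --error=%x_%j.err    # e.g. test_job_12345.err
```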
Time Limit
#SBATCH --time=00:10:00
This caps the job at 10 minutes of wall-clock time. If it runs longer, Slurm kills it.
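The value accepts several formats: plain minutes, HH:MM:SS, and a days-hours form are all valid:

```shell
#SBATCH --time=30            # 30 minutes
#SBATCH --time=01:30:00      # 1 hour 30 minutes (HH:MM:SS)
#SBATCH --time=2-00:00:00    # 2 days (D-HH:MM:SS)
```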
Tasks and CPUs
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
- ntasks → number of processes
- cpus-per-task → CPU cores per process
For simple jobs, keep both as 1.
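When you do request more than one core for a multi-threaded program, a common pattern is to pass the allocation on through the SLURM_CPUS_PER_TASK environment variable, which Slurm sets inside a running job. A minimal sketch (the :-1 fallback keeps it runnable outside Slurm, and OMP_NUM_THREADS assumes an OpenMP program):

```shell
#SBATCH --cpus-per-task=4

# Tell an OpenMP program how many threads to use;
# fall back to 1 when not running under Slurm.
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
echo "Using $OMP_NUM_THREADS threads"
```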
Memory
#SBATCH --mem=1G
This requests 1 GB of RAM.
If your job uses more than it requested, it may be killed with an out-of-memory error.
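Note that --mem requests memory per node. An alternative is --mem-per-cpu, which scales the request with the number of allocated cores:

```shell
#SBATCH --mem=1G           # 1 GB total per node for the job
#SBATCH --mem-per-cpu=1G   # 1 GB for each allocated CPU core
```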
What Goes Inside the Script
After the SBATCH lines, you add the commands you want to run:
echo "Hello from Slurm!"
hostname
In real use cases, this could be:
- Running a Python script
- Executing a simulation
- Launching an MPI job
Example:
python my_script.py
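Putting it together, a job script for a Python program might look like the sketch below. The module load line is an assumption; module names differ between clusters, and my_script.py stands in for your own program.

```shell
#!/bin/bash
#SBATCH --job-name=py_job
#SBATCH --output=py_job_%j.log
#SBATCH --time=00:30:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=2G

# Load the cluster's Python environment (name varies by site)
module load python

python my_script.py
```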
Submitting the Job
Once your script is ready, save it as:
job.sh
Then submit it:
sbatch job.sh
You will get a job ID like:
Submitted batch job 12345
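If you want the job ID inside a script, for example to chain jobs together, you can capture it. sbatch --parsable prints only the ID; alternatively, strip the last word off the normal output line, as in this sketch using the example line above:

```shell
# On a real cluster: jobid=$(sbatch --parsable job.sh)
# Here we parse the human-readable line instead:
line="Submitted batch job 12345"
jobid=${line##* }   # keep everything after the last space
echo "$jobid"
```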
Checking Job Status
To see if your job is running:
squeue -u your_username
To get more details:
scontrol show job 12345
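A few useful variations (these assume squeue and watch are available on the login node):

```shell
squeue -u $USER                # only your jobs
squeue -j 12345                # a single job by ID
watch -n 10 squeue -u $USER    # refresh the view every 10 seconds
```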
Viewing Output
After the job finishes:
cat output.log
cat error.log
This is where you check results or debug issues.
Common Beginner Mistakes
A few things that often go wrong:
- Requesting too little memory → job fails
- Setting very short time limits → job gets killed
- Running heavy jobs on login node instead of using Slurm
- Forgetting to check error logs
Final Thoughts
Writing your first Slurm job script might seem small, but it is the foundation of everything you do in HPC.
Once you understand this, you can:
- Run bigger workloads
- Scale across multiple nodes
- Work with GPUs and parallel jobs
Start simple, test small, and build from there.