Muhammad Zubair Bin Akbar
What Actually Happens When You Run sbatch in Slurm

If you work with HPC clusters, you likely use sbatch every day. You submit a script and expect it to run.

But that single command triggers a full workflow inside Slurm.

Understanding this internal flow helps you debug issues faster, optimize job performance, and better understand how your cluster behaves.

Step 1: Submitting the Job

When you run:

sbatch job.sh

You are not starting the job. You are submitting a request to Slurm.

The script includes:

  • Resource requirements such as CPUs, memory, GPUs
  • Job metadata like name and output paths
  • The actual commands to execute

At this point, Slurm simply accepts the job.
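A minimal job.sh covering all three parts might look like this (the job name, output path, and resource values are placeholders, not recommendations):

```shell
#!/bin/bash
#SBATCH --job-name=demo          # job metadata: name
#SBATCH --output=demo_%j.out     # output path; %j expands to the Job ID
#SBATCH --cpus-per-task=4        # resource request: CPUs
#SBATCH --mem=8G                 # resource request: memory
#SBATCH --time=00:30:00          # wall-clock limit

# The actual commands to execute:
echo "Running on $(hostname)"
```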

Step 2: Communication with slurmctld

The sbatch command sends the job to the Slurm controller daemon, slurmctld.

This daemon:

  • Assigns a Job ID
  • Stores the job details
  • Marks the job as PENDING

Nothing is running yet.
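You can watch this handoff from the command line: sbatch prints the Job ID that slurmctld assigned, and the --parsable flag makes it print only the ID, which is convenient in scripts. This sketch assumes a working cluster and a job.sh in the current directory:

```shell
# Submit the script and capture the Job ID assigned by slurmctld.
jobid=$(sbatch --parsable job.sh)

# The job is stored and marked PENDING; confirm its state:
squeue -j "$jobid" --format="%i %T %r"   # Job ID, state, reason
```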

Step 3: Job Enters the Queue

The job is now placed in the scheduling queue.

The scheduler evaluates:

  • Job priority
  • Fairshare usage
  • Partition limits
  • Resource availability

This determines when your job will run.
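If your cluster uses the multifactor priority plugin with accounting enabled, you can inspect these factors directly (a sketch; the exact output depends on site configuration):

```shell
# Break a pending job's priority into its components
# (age, fairshare, partition, QOS, ...):
sprio -l

# Show fairshare usage per account and user:
sshare -a
```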

Step 4: Scheduling Decision

The scheduler continuously checks:

  • Free nodes
  • Resource fragmentation
  • Backfill opportunities

If your job fits available resources, it gets selected. Otherwise, it stays pending.
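Two quick ways to see where a pending job stands, assuming $jobid holds your Job ID:

```shell
# Ask the backfill scheduler for an estimated start time,
# if it has computed one for this job:
squeue -j "$jobid" --start

# The REASON column explains why the job is still pending,
# e.g. Priority, Resources, or a partition limit:
squeue -j "$jobid" -o "%.10i %.9T %.20r"
```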

Step 5: Resource Allocation

Once selected, Slurm:

  • Assigns specific compute nodes
  • Reserves CPUs, memory, and GPUs
  • Changes job state to RUNNING

Now your job has allocated resources.
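Once the job is RUNNING, scontrol reports the concrete allocation (again assuming $jobid holds your Job ID):

```shell
# Show the assigned nodes and reserved resources:
scontrol show job "$jobid" | grep -E "JobState|NodeList|NumCPUs|TRES"
```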

Step 6: Node-Level Communication

Each compute node runs a daemon called slurmd.

The controller sends job details to these nodes. The nodes prepare the execution environment.

Step 7: Job Execution via slurmstepd

On the compute node, slurmstepd is launched.

This process:

  • Starts your application
  • Manages job steps
  • Handles output and error streams
  • Enforces resource limits using cgroups

Your script begins executing here.
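Job steps are easiest to see with srun inside the batch script: each srun launch becomes a numbered step (<jobid>.0, <jobid>.1, ...), each managed by its own slurmstepd. The program names below are placeholders:

```shell
#!/bin/bash
#SBATCH --ntasks=4

# Each srun call becomes a separate job step with its own slurmstepd:
srun --ntasks=4 ./preprocess   # step <jobid>.0 (placeholder program)
srun --ntasks=4 ./simulate     # step <jobid>.1 (placeholder program)

# The batch script itself appears as the "batch" step in accounting.
```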

Step 8: Monitoring During Execution

While the job runs:

  • Slurm tracks resource usage
  • Logs are written to output files
  • Accounting data is collected

You can monitor the job using:

squeue
scontrol show job <jobid>
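For live resource usage of a running job's steps, sstat complements squeue (it needs the job-accounting-gather plugin enabled; $jobid is assumed to hold your Job ID):

```shell
# Live memory and CPU statistics for a running job's steps:
sstat -j "$jobid" --format=JobID,MaxRSS,AveCPU

# Follow the output file as the job writes it
# (slurm-<jobid>.out is the default name if you did not set --output):
tail -f "slurm-${jobid}.out"
```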

Step 9: Job Completion

When the job finishes:

  • slurmstepd exits
  • Resources are released
  • Temporary processes are cleaned up

The job state becomes COMPLETED, FAILED, TIMEOUT, or CANCELLED.
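The final state follows the script's exit status: 0 maps to COMPLETED and non-zero to FAILED, while TIMEOUT and CANCELLED come from limits and scancel rather than the script itself. Here is a plain-bash sketch of that mapping, runnable without Slurm:

```shell
#!/bin/bash
# Plain-bash illustration: slurmctld classifies the finished job
# by the batch script's exit status, much like this.
run_job() {
  false   # stand-in for a failing workload; exits with status 1
}

run_job
status=$?
if [ "$status" -eq 0 ]; then
  echo "state: COMPLETED"
else
  echo "state: FAILED (exit code $status)"
fi
```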

Step 10: Accounting and Logs

Finally:

  • Job statistics are stored
  • Output files remain available
  • Usage data is recorded

You can check this using:

sacct
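sacct accepts --format to pick fields and --starttime to bound the query; a sketch, with $jobid assumed to hold your Job ID and a placeholder date:

```shell
# Key accounting fields for one job:
sacct -j "$jobid" --format=JobID,JobName,State,Elapsed,MaxRSS,ExitCode

# All of your jobs since a given date (placeholder date):
sacct -u "$USER" --starttime=2024-01-01 --format=JobID,State,Elapsed
```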

Full Flow Summary

  1. Submit job using sbatch
  2. slurmctld receives and queues it
  3. Scheduler evaluates priority
  4. Resources are allocated
  5. slurmd prepares nodes
  6. slurmstepd runs the job
  7. Job completes and resources are released

Common Misconceptions

“sbatch runs the job immediately”
It only submits the job.

“Pending means failure”
It usually means waiting for resources.

“Slurm just runs scripts”
It manages scheduling, allocation, execution, and cleanup.

Final Thought

sbatch may look simple, but it triggers a complete orchestration pipeline inside Slurm.

Once you understand this flow, debugging becomes easier, performance tuning improves, and cluster behavior becomes predictable.

Top comments (1)

MournfulCord

Good breakdown. The handoff between slurmctld and slurmd is where most people get lost, so it’s nice seeing it laid out cleanly.