Running Parallel Jobs
Distributed memory
In the distributed-memory model, each process has its own private memory and does not share it with any other process. A distributed-memory job can run across multiple compute nodes. It requires a program written with a parallel programming model such as the Message Passing Interface (MPI), and it requires additional setup to scatter the processes over the compute nodes. Suppose we want to run a job with 16 processes, placing 4 processes on each compute node; we may write:
#!/bin/bash
#SBATCH -J distributed # Job name
#SBATCH -N 4 # Total number of nodes requested
#SBATCH -n 16 # Total number of MPI tasks
#SBATCH --ntasks-per-node=4 # Number of tasks per node
#SBATCH -t 120:00:00 # Run time (hh:mm:ss)
mpirun -np 16 -ppn 4 [ options ] <program> [ <args> ]