Batch job scheduling on the AMNH clusters is handled by the resource manager, Torque, and the job scheduler, Moab. The scheduler uses a set of priorities to determine how jobs are distributed across the machine nodes, while the resource manager monitors all submitted jobs and all resources.
All jobs must be submitted to the scheduler through the Portable Batch System (PBS), a networked subsystem for controlling a workload of batch jobs. A job is represented by a shell script that contains the PBS directives (lines beginning with #PBS) and the shell commands needed to run the job. You can create the script file with the editor of your choice or copy over a file created on your local host.
Here is an example of a simple PBS job script.
myjob.sub
#!/bin/bash
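# Name the job. By default, stdout and stderr are written to the files jobname.oJobId and jobname.eJobId, respectively. The JobId is a unique system number given to each job.
#PBS -N jobname
# Merge stderr into stdout so that both are written to jobname.oJobId
#PBS -j oe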
# Specify the length of time the job should run (format DAY:HR:MIN:SEC); replace the zeros with the limit your job needs
#PBS -l walltime=00:00:00:00
#Shell commands
echo "-----"
echo "Changing to $PBS_O_WORKDIR"
cd "$PBS_O_WORKDIR"
echo "-----"
echo "Running on the following hosts"
echo "$PBS_NODEFILE"
cat "$PBS_NODEFILE"
echo "-----"
# MPI command to run in parallel (append input_data_file only if the executable needs one)
mpirun /fullpath/executable input_data_file
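The node file printed by the script above can also be used to check what the job was actually given. The following is a minimal sketch, assuming (as is typical for Torque, but worth confirming on your cluster) that the file named by $PBS_NODEFILE contains one hostname per line, with one line per allocated processor. The function names count_slots and count_hosts are hypothetical helpers, not part of PBS.

```shell
#!/bin/bash
# Hypothetical helpers for inspecting a PBS node file.
# Assumption: one hostname per line, one line per allocated processor.

# Total number of allocated processors (non-empty lines in the node file).
count_slots() {
    grep -c . "$1"
}

# Number of distinct hosts listed in the node file.
count_hosts() {
    sort -u "$1" | grep -c .
}

# Inside a job, PBS sets PBS_NODEFILE; outside a job this block is skipped.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "Hosts: $(count_hosts "$PBS_NODEFILE"), processors: $(count_slots "$PBS_NODEFILE")"
fi
```

If your MPI installation does not pick up the allocation by itself, the processor count could be passed to the launcher, for example with mpirun -np "$(count_slots "$PBS_NODEFILE")".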
To submit the job to the queue, issue the command:
msub -l nodes=N:ppn=np myjob.sub
The msub command submits the job to the scheduler. The -l option is a lowercase L, N is the number of nodes, and np is the maximum number of processors per node (ppn).
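As an illustration of how N and np fit into the -l option, here is a small sketch that builds the resource string and prints the resulting msub command instead of running it. The helper name resource_request and the values 2 and 8 are assumptions chosen for the example, not cluster defaults.

```shell
#!/bin/bash
# Hypothetical helper: builds the resource string passed to "msub -l".
# Arguments: number of nodes (N) and processors per node (np).
resource_request() {
    local nodes="$1" ppn="$2" value
    for value in "$nodes" "$ppn"; do
        case "$value" in
            # Reject empty values, anything containing a non-digit, and a literal 0.
            ''|*[!0-9]*|0)
                echo "error: nodes and ppn must be positive integers" >&2
                return 1
                ;;
        esac
    done
    printf 'nodes=%s:ppn=%s\n' "$nodes" "$ppn"
}

# Example: request 2 nodes with 8 processors each for myjob.sub.
# The command is only printed here; remove "echo" to actually submit.
request=$(resource_request 2 8) && echo msub -l "$request" myjob.sub
```

Printing the command first makes it easy to check the request before it reaches the scheduler.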
The Moab scheduler provides a utility to set policies for fair utilization of the available resources.
After creating your job script, use the msub command to submit it to the scheduler:
msub -l nodes=N:ppn=np jobscript
After submission, a job can be in one of the following states: active, idle, blocked, or deferred. The following table lists the scheduler commands available to end users.
| Command | Description |
|---|---|
| msub | Submit a job |
| canceljob | Cancels an existing job |
| checkjob | Displays job state, resource requirements, environment, constraints, history, and allocated resources |
| setres | Creates a user reservation |
| releaseres | Releases a user reservation |
| showbf | Shows resource availability for jobs with specific resource requirements |
| showstart | Shows estimated start time for idle jobs |
| showq | Displays detailed prioritized list of active and idle jobs |
| showstats | Shows detailed usage statistics for users, groups, and accounts |
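To show how the commands in the table fit together after a submission, here is a hedged sketch of a small wrapper. The function name follow_job, the RUN switch, and the job ID 12345 are hypothetical; the real job ID is the one reported when the job is submitted. By default the wrapper only prints the commands, so it can be tried on a machine without Moab installed.

```shell
#!/bin/bash
# Hypothetical wrapper: prints (or, with RUN=1, executes) the Moab commands
# commonly used to follow a single job after submission.
follow_job() {
    local jobid="$1" cmd
    for cmd in "checkjob $jobid" "showstart $jobid" "showq"; do
        if [ "${RUN:-0}" = "1" ]; then
            # Intentionally unquoted so the command and its argument are split.
            $cmd
        else
            echo "$cmd"
        fi
    done
}

# 12345 is a placeholder job ID used for illustration only.
follow_job 12345
```

On a cluster login node, setting RUN=1 before calling follow_job would run the commands for real, and canceljob with the same job ID would remove the job if it is no longer needed.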