Job Scheduling

Batch job scheduling on the AMNH clusters is handled by the Torque resource manager and the Moab job scheduler. The scheduler uses a set of priorities to decide how jobs are distributed across the machine nodes, while the resource manager monitors all submitted jobs and all resources.

Job Submission File

All jobs must be submitted to the scheduler through the Portable Batch System (PBS), a networked subsystem for controlling a workload of batch jobs. A job is represented by a shell script containing PBS directives (lines beginning with #PBS) and the shell commands needed to run the job. Create the script with the editor of your choice, or copy over a file created on your local host.

Here is an example of a simple PBS job script.


 

myjob.sub


#!/bin/bash

# Name the job. With -j oe, stderr is merged into stdout and both are
# written to jobname.oJobId. Without -j oe, stdout goes to jobname.oJobId
# and stderr to jobname.eJobId. The JobId is a unique system number given
# to each job.
#PBS -N jobname
#PBS -j oe

# Specify the length of time the job should run (format DAY:HR:MIN:SEC)
#PBS -l walltime=00:00:00:00

# Shell commands
echo "-----"
echo "Changing to $PBS_O_WORKDIR"
cd "$PBS_O_WORKDIR"
echo "-----"
echo "Running on the following hosts"
echo "$PBS_NODEFILE"
cat "$PBS_NODEFILE"
echo "-----"

# MPI command to run in parallel (append an input data file if necessary)
mpirun /fullpath/executable input_data_file
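
The script above prints the contents of $PBS_NODEFILE. Torque writes one line to that file per allocated processor slot, so the file can also be used to size the MPI launch. The following is a minimal sketch, not the cluster's prescribed method: the helper names count_slots and count_nodes and the stand-in hosts node01 and node02 are hypothetical, and the -np and -machinefile flags are assumed to be accepted by the installed mpirun (flag names vary between MPI implementations, and Torque-aware builds may not need them at all).

```shell
#!/bin/bash
# Hypothetical helpers: count processor slots (lines) and distinct hosts
# in a Torque node file.
count_slots() { wc -l < "$1" | tr -d '[:space:]'; }
count_nodes() { sort -u "$1" | wc -l | tr -d '[:space:]'; }

# Outside a real job PBS_NODEFILE is unset, so build a small stand-in file
# with made-up host names to make the sketch runnable anywhere.
made_stub=0
if [ -z "${PBS_NODEFILE:-}" ]; then
    PBS_NODEFILE=$(mktemp)
    printf 'node01\nnode01\nnode02\nnode02\n' > "$PBS_NODEFILE"
    made_stub=1
fi

NPROCS=$(count_slots "$PBS_NODEFILE")
NNODES=$(count_nodes "$PBS_NODEFILE")
echo "Allocated $NPROCS processor slots on $NNODES nodes"

# Print the launch line rather than running it; /fullpath/executable is the
# same placeholder used in the job script.
echo "mpirun -np $NPROCS -machinefile $PBS_NODEFILE /fullpath/executable"

# Remove the stand-in file if this sketch created it.
if [ "$made_stub" = "1" ]; then rm -f "$PBS_NODEFILE"; fi
```

Inside a job script, only the two helper definitions and the NPROCS line would be needed, with the final echo replaced by the actual mpirun call.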

 


To submit the job to the queue, issue the command:

  msub -l nodes=N:ppn=np myjob.sub

 

The msub command places the job in the scheduler's queue. The -l option is a lowercase L, N is the number of nodes, and np is the maximum number of processors per node (ppn).
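
As a concrete illustration of how N and np combine, the sketch below assembles the submission command for a given node count and processors-per-node count and reports the total number of processors requested (N times np). The function name build_msub_cmd and the values 2 and 8 are assumptions chosen for the example; the command is only printed, not executed.

```shell
#!/bin/bash
# Hypothetical helper: print the msub command line for a node count,
# a processors-per-node count, and a job script name.
build_msub_cmd() {
    local nodes="$1" ppn="$2" script="$3"
    printf 'msub -l nodes=%s:ppn=%s %s\n' "$nodes" "$ppn" "$script"
}

# Illustrative request: 2 nodes with 8 processors on each.
NODES=2
PPN=8

build_msub_cmd "$NODES" "$PPN" myjob.sub
echo "Total processors requested: $((NODES * PPN))"
```

On a login node, the printed line can be pasted at the prompt to submit the job.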

 

Fairshare Scheduling Policy

The Moab scheduler provides a utility for setting policies that ensure fair utilization of the available resources.

Scheduler End-User Commands

Monitoring Jobs

After creating your job script, use the msub command to submit it to the scheduler:

  msub -l nodes=N:ppn=np jobscript

After submission, a job can be in one of four states: active, idle, blocked, or deferred.

  1. Active jobs have started running.
  2. Idle jobs are eligible to run, but resources are not yet available.
  3. Blocked jobs are not being considered to run, most likely because of policy violations. These jobs will eventually be moved to the idle state.
  4. Deferred jobs have a batch hold because they request a type or amount of resources that does not exist on the system. These jobs will not run.

The following table lists the scheduler commands that are available to end-users.

Moab End-User Commands

Command            Description
msub               Submits a job
canceljob jobID    Cancels an existing job
checkjob           Displays job state, resource requirements, environment, constraints, history, and allocated resources
setres             Creates a user reservation
releaseres         Releases a user reservation
showbf             Shows resource availability for jobs with specific resource requirements
showstart          Shows the estimated start time for idle jobs
showq              Displays a detailed prioritized list of active and idle jobs
showstats          Shows detailed usage statistics for users, groups, and accounts
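
A typical monitoring sequence after submission might look like the sketch below. It is a hedged illustration only: the job ID 12345 is made up (msub prints the real ID when the job is submitted), and the run wrapper and DRY_RUN variable are hypothetical conveniences that print each command instead of executing it, so the sketch can be tried on a machine without Moab installed.

```shell
#!/bin/bash
# Hypothetical job ID; use the ID that msub printed for your job.
JOBID=12345

# Print each command unless DRY_RUN=0 is set, in which case run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run showq                 # list active and idle jobs in priority order
run checkjob "$JOBID"     # inspect the state and requirements of one job
run showstart "$JOBID"    # estimated start time if the job is still idle
run canceljob "$JOBID"    # cancel the job if it is no longer needed
```

Setting DRY_RUN=0 on a login node would execute the same four commands for real, including the cancellation, so remove the last line if the job should keep running.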