Science Computing Cluster

The Clusters

Enyo Cluster

The AMNH scientific computing facility consists of several high-performance computing platforms: two Intel Xeon multi-node Linux clusters, one AMD multi-core server, and a six-node special-purpose GRavity PipelinE (GRAPE) machine. The clusters are used primarily to run scientific applications ranging from phylogenetic analysis in computational biology to large-scale astrophysics simulations.

General specifications for each machine are listed in the Hardware section below.

Getting Started

Accessing the AMNH Cluster

Secure Shell Login

The computing clusters are accessible over a private network using secure shell login (ssh) for all authorized users. Remote access is provided through a Virtual Private Network (VPN) or a gateway host. Once a secure connection is established, users may log in to a cluster from a local host using ssh. The terms cluster-hostname and cluster-host below refer to the name of the machine you wish to access. The ssh command is:

$ ssh yourusername@cluster-hostname

You will be prompted for your password on the cluster host:

yourusername@cluster-hostname's password:

Once logged in, issue the command pwd, which prints the name of the current working directory. You should be in /home/yourusername.

Use one of the following methods for ssh login from your local host machine:

  • A terminal window (a Unix command-line interface)
  • An open-source graphical front-end application with drag-and-drop functionality, available for download
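
For example, to log in to the Enyo cluster from a terminal (the username jdoe is hypothetical; use your own) and confirm the working directory:

$ ssh jdoe@enyo.pcc.amnh.org
jdoe@enyo.pcc.amnh.org's password:
$ pwd
/home/jdoe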

File Transfer

Command-line File Transfer

Transferring files to the cluster host is done using the secure copy (scp) command issued from the local host. To copy a file to the cluster machine:

$ scp filename username@cluster-host.pcc.amnh.org:/home/username

A copy of the file with the same name will be placed in the user's specified directory on the cluster machine unless a new filename is specified.

To retrieve a file from the cluster host to your local host:

$ scp username@cluster-host.pcc.amnh.org:/home/username/filename .

A copy of the file with the same name will be placed on the local machine unless a new filename is specified. The period (.) at the end of the last command means ‘put the file here’, i.e. in the working directory on the local host.
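
For example, to copy a file to the cluster under a new name, or to copy an entire directory recursively (the directory name results/ is illustrative):

$ scp filename username@cluster-host.pcc.amnh.org:/home/username/newfilename
$ scp -r results/ username@cluster-host.pcc.amnh.org:/home/username/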

GUI File Transfer

If you have a graphical interface application such as Fugu, you can use its drag-and-drop feature to transfer files between the local host and the cluster.

Job Scheduling

Batch job scheduling is implemented on the AMNH clusters through the Torque resource manager and the Moab job scheduler. The scheduler uses a set of priorities to determine how a job is distributed across the machine nodes, and the resource manager monitors all submitted jobs and resources.

Job Submission File

All jobs must be submitted to the scheduler for processing via the Portable Batch System (PBS), a networked subsystem used for controlling a workload of batch jobs. A job is represented by a shell script, which contains PBS directives (lines beginning with #PBS) and the shell commands needed to run the job. The script file can be created with the editor of your choice or by copying a file created on your local host.

Here is an example of a simple PBS job script, myjob.sub:

#!/bin/bash

# Name the job. With the -j oe option, stderr is joined with stdout and both
# are written to a single file, jobname.oJobID, where JobID is the unique
# system number given to each job.
#PBS -N jobname
#PBS -j oe

# Specify the length of time the job should run (format DAY:HR:MIN:SEC)
#PBS -l walltime=00:00:00:00

# Shell commands
echo "-----"
echo "Changing to $PBS_O_WORKDIR"
cd $PBS_O_WORKDIR
echo "-----"
echo "Running on the following hosts"
echo $PBS_NODEFILE
cat $PBS_NODEFILE
echo "-----"

# MPI command to run in parallel (give an input data file if necessary)
mpirun /fullpath/executable input_data_file

To submit the job to the queue, issue the command

  msub -l nodes=N:ppn=np myjob.sub

The submission command msub places the job onto the scheduler. The parameter -l is a lowercase L, N is the number of nodes, and np is the maximum number of processors per node (ppn).
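
For example, to request four nodes with eight processors per node (illustrative values; choose them to match the machine you are using) and then confirm that the job is queued:

$ msub -l nodes=4:ppn=8 myjob.sub     # submit the job; msub reports the assigned job ID
$ showq                               # the job should appear in the queue listing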

 

Fairshare Scheduling Policy

The Moab scheduler provides a utility to set policies for fair utilization of the available resources.

Scheduler End-User Commands

Monitoring Jobs

After creating your job script, use the msub command to submit it to the scheduler:

  msub -l nodes=N:ppn=np jobscript

The state of a job after submission can be: active, idle, blocked, or deferred.

  1. Active jobs have been started.
  2. Idle jobs are eligible to run, but resources are not yet available.
  3. Blocked jobs are not being considered to run, most likely due to policy violations. These jobs will eventually be moved to the idle state.
  4. Deferred jobs have a batch hold due to a request for a type or amount of resources that does not exist on the system. These jobs will not run.
The following table lists the Moab scheduler commands available to end-users.

Moab End-User Commands
Command            Description
msub               Submits a job
canceljob jobID    Cancels an existing job
checkjob jobID     Displays job state, resource requirements, environment, constraints, history and allocated resources
setres             Creates a user reservation
releaseres         Releases a user reservation
showbf             Shows resource availability for jobs with specific resource requirements
showq              Displays a detailed prioritized list of active and idle jobs
showstart          Shows the estimated start time for idle jobs
showstats          Shows detailed usage statistics for users, groups and accounts
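
For example, a typical sequence for following a job after submission (the job ID 12345 is hypothetical; use the ID reported when your job was submitted):

$ checkjob 12345       # current state, resource requirements and allocated resources
$ showstart 12345      # estimated start time while the job is still idle
$ canceljob 12345      # remove the job from the queue if it is no longer needed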

Use Policies

 


General User Policies

This document describes the general user policies for the High-performance Computing facility (HPC) at the American Museum of Natural History. These policies are established to ensure the fair use and sustainability of this shared resource. If you have any questions or remarks about the policies described herein, please contact the Manager of Scientific Computing.

Use of the HPC signifies compliance with and acceptance of general user policies contained on this web site.

Communication

HPC users receive regular notices on system status through the HPC "notice" mailing list. Machine-specific information is also posted in the "Message of the Day" viewable upon login.

Requests for help in HPC-related matters should be sent to support.

Security

HPC accounts are created for individual users; accounts and passwords must not be shared.
Failure to maintain the confidentiality of your password can result in the following actions:

  • The offending user's account will be suspended and the user will be forced to change his/her password. The account will be reinstated after the approval of the HPC administrator.
  • After a second incident, the user account will be suspended and the user will be forced to change his/her password. The account will be reinstated only after the approval of the HPC advisory panel chair.
  • If more than two incidents occur, the HPC advisory panel will determine whether account privileges on HPC will be permanently terminated.

System Abuse

The HPC is a shared system; therefore, your actions can impact the work of other users. The HPC is not comparable to a desktop machine: practices that are acceptable on your desktop can have serious impacts on the HPC system and affect many other users. The following is a list of general guidelines that every HPC user should follow.

  • The Moab/Torque scheduler and resource manager controls the scheduling of batch jobs on the system. A description of the end-user commands is given in the Job Scheduling section above.
  • Do not over-allocate parallel jobs. Make certain that your job uses the number of processors that you have requested or reserved (see the sketch after this list).
  • Do not run highly experimental code that might compromise the usability of the network fabric, any of the compute nodes, or the shared storage system.
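
As a minimal sketch of the second guideline (assuming an MPI application launched with mpirun from a Torque job script; the job name and executable path are placeholders), the node file supplied by Torque can be used to size the run so that it matches the allocation exactly:

#!/bin/bash
#PBS -N sized_job
#PBS -l nodes=2:ppn=4

cd $PBS_O_WORKDIR

# Torque lists each allocated node once per requested processor, so the number
# of lines in $PBS_NODEFILE equals the total processor count (here 2 x 4 = 8).
NPROCS=$(wc -l < $PBS_NODEFILE)

# Launch exactly as many MPI processes as were requested.
mpirun -np $NPROCS /fullpath/executable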

The HPC staff may take the following actions immediately if an abuse of the system is identified.

  • Offending jobs will be suspended and guidance provided to resolve the misuse issue.
  • The system administrator will conduct a review of a user who misuses the system two or more times after being warned. Offending user accounts may be suspended following the review.

Maintenance

Periodically the clusters are taken offline and shut down to perform scheduled maintenance. Users will be notified with sufficient lead time to plan for any interruptions.

User Reservation

Users may request system time for high-priority projects. All requests for access to a dedicated time slot must meet the following criteria.

  • The person submitting the request must have an HPC account.
  • The request must be justified.
  • Readiness must be demonstrated within two weeks of the dedicated run period.
  • Runs must complete within 48 hours.
  • Runs must use one half or more of all HPC processor cores.

Policy Disputes

Appeals concerning the general user policy described herein should be made to the HPC Manager.

Hardware

 

 Scientific Computing Cluster Machines

Name      Network Address        Nodes/Processors   Memory         Purpose
Demeter   demeter.pcc.amnh.org   128/256            2GB RAM/core   Production
Enyo      enyo.pcc.amnh.org      32/128             8GB RAM/core   Production
Eve       eve.pcc.amnh.org       8-way server       128GB RAM      Development

 

Cluster Account Request

Please use this form to request an account on the AMNH cluster, naming the specific cluster on which the account is requested. Please indicate any affiliation with the Museum in the comments section. All requests are subject to review.

Applications

Bioinformatics

Research in Comparative Biology and Genomics uses several bioinformatics software applications, both public-domain packages and applications developed in-house.

  • POY (Phylogenetic Analysis of DNA and other Data using Dynamic Homology) estimates phylogenies and alignments using maximum parsimony or maximum likelihood optimality criteria.
  • MALIGN is a multiple sequence alignment program whose procedures closely parallel those used to search for minimum-length cladograms.
  • RAxML is a program for sequential and parallel Maximum Likelihood based inference of large phylogenetic trees. RAxML offers two ways to exploit parallelism: fine-grained parallelism on shared-memory machines or multi-core architectures, and coarse-grained parallelism on Linux clusters (a job script sketch follows this list).
  • MrBayes is a free software program that employs Bayesian estimation of phylogeny.
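
As an illustration of running one of these applications under Torque/Moab, here is a minimal job script sketch for a coarse-grained parallel RAxML run; the binary name, install path, input alignment, model and seed are assumptions that depend on the local installation:

#!/bin/bash
#PBS -N raxml_run
#PBS -j oe
#PBS -l walltime=00:01:00:00          # format DAY:HR:MIN:SEC

cd $PBS_O_WORKDIR

# Hypothetical MPI build of RAxML; -s alignment, -N number of independent
# searches (distributed across MPI processes), -n run name, -m model, -p seed.
mpirun /fullpath/raxmlHPC-MPI -s alignment.phy -N 10 -n test_run -m GTRGAMMA -p 12345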

Astrophysics-MHD

The main numerical simulation applications and visualization tools available on the clusters are listed below.

  • The Pencil Code is a high-order finite-difference code for compressible hydrodynamic flows with magnetic fields.
  • The RAMSES code uses the Adaptive Mesh Refinement (AMR) method for general-purpose simulations in self-gravitating fluid dynamics. The code is free software written in Fortran 90 with extensive use of the MPI library.
  • IDL is a licensed software package for data analysis and data visualization.

HPC Training

The Cluster Computing Training Series

The series provides new and existing users with training and support in running science applications on the computer cluster facilities. Teaching and training sessions are offered quarterly on topics at all levels, from basic cluster computing usage to applications and programming in parallel computing.

The training sessions are designed to teach users about the multicore computer cluster architecture, to demonstrate tools for managing computational simulation-driven applications, and to keep users abreast of new applications and tools in computational science.

Introductory topics will cover accessing the cluster machines, exploring the Linux environment, batch processing and system tools.

  • Unix (Linux) will teach the basics of the Unix (Linux) environment for unfamiliar users.
  • Batch Processing will provide an overview of running serial and parallel applications using the Moab/Torque resource manager and scheduler.
  • Compilers, Makefiles & Debugging will provide descriptions of various compilers, explain and demonstrate the use of Makefiles and how to use the system debuggers.
  • Basic Scripting will cover the Linux shell environment and writing scripts.

Advanced topics may include parallel processing (programming), optimization and performance.

After participating in the workshop series, users are invited to complete a survey; the feedback is used to improve the workshops and to design future training sessions.