Gaussian

Gaussian is a computational chemistry application. It is also very hard on storage systems, and has historically been able to overwhelm centrally managed storage services. On two separate occasions, Gaussian jobs using home directories or scratch directories caused storage traffic to spiral out of control, slowing the storage system from the 25-50 MB/s that we commonly see to around 1-2 MB/s or less. A slowdown like this affects all jobs on the system, for all users.

Because of these problems, make sure that Gaussian uses only the node's local disk during its run.

Please keep in mind the following points; note that the example script provided on this page does all of this:

  • If FSL administrators discover that Gaussian jobs are having a negative effect on the system as a whole, they will kill the jobs with little or no warning. We don't like to do this, but it is sometimes necessary.
  • Write your job scripts to use the /tmp directory on the node's local hard drive, not your home directory, scratch, or group directories. Best practice is to create a folder named after your job ID, then put all your scratch files in that folder, e.g. "mkdir -pv /tmp/$PBS_JOBID". An example script is provided below that does this.
  • The environment variable GAUSS_SCRDIR should be defined to reference the temporary directory. Gaussian uses this variable to know where to store temporary files during the job's run.
  • The output files should also be put in the local scratch directory, and copied back to home directories as appropriate at the end of the job script.
  • If the local /tmp directory has inadequate capacity for your scratch data needs, please let us know as soon as possible. A table with current /tmp capacities is listed below.
  • The output of jobs (the .eJOBID and .oJOBID files) is also spooled inside /tmp, so take that into account as you calculate your storage needs.
  • Remove the /tmp folders and scratch files your job creates when your job completes. The example script below does this.
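
The points above about /tmp and GAUSS_SCRDIR can be checked before Gaussian starts. The following is a minimal sketch, not part of the FSL tooling: the helper name is_local_scratch and the fallback job ID are hypothetical, and the check is a simple string comparison.

```bash
#!/bin/bash
# Hypothetical helper (not part of Gaussian or PBS): succeeds only if the
# given path lies under /tmp. It is a plain string check and does not
# resolve symlinks or ".." components.
is_local_scratch() {
    case "$1" in
        /tmp/?*) return 0 ;;
        *)       return 1 ;;
    esac
}

# PBS_JOBID is set by the scheduler inside a job; the fallback value is an
# assumption that exists only so the sketch can be tried interactively.
job_dir="/tmp/${PBS_JOBID:-interactive_test.$$}"
export GAUSS_SCRDIR="$job_dir/temporary_files"

if is_local_scratch "$GAUSS_SCRDIR"; then
    echo "GAUSS_SCRDIR is on local disk: $GAUSS_SCRDIR"
else
    echo "GAUSS_SCRDIR ($GAUSS_SCRDIR) is not under /tmp; do not start Gaussian" >&2
fi
```

A guard like this is cheap to run at the top of a job script and catches the case where a copied script still points GAUSS_SCRDIR at a home or group directory.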

Storage Space Limitations

Nodes            Approximate space in /tmp
marylou5         30 GB
m6               100 GB
bigmem (Intel)   100 GB
bigmem (AMD)     200 GB
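
The figures above are approximate, and other jobs may already be using part of /tmp, so it can help to check the space actually available on the node when the job starts. The sketch below assumes the POSIX output format of df (the -P flag); the function names and the 20 GB figure are hypothetical placeholders, not site recommendations.

```bash
#!/bin/bash
# Print the free space, in KB, on the filesystem that holds the given directory.
# With "df -Pk" the second output line is the filesystem entry and the fourth
# column is the available space in 1024-byte blocks.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if the directory has at least the requested number of GB free.
has_free_gb() {
    local dir="$1" need_gb="$2"
    local avail_kb
    avail_kb=$(free_kb "$dir")
    [ -n "$avail_kb" ] || return 1
    [ "$avail_kb" -ge $(( need_gb * 1024 * 1024 )) ]
}

REQUIRED_GB=20   # hypothetical estimate of this job's scratch needs

if has_free_gb /tmp "$REQUIRED_GB"; then
    echo "/tmp has at least ${REQUIRED_GB} GB free"
else
    echo "/tmp has less than ${REQUIRED_GB} GB free; a node type with more local space may be needed" >&2
fi
```

Remember that the .eJOBID and .oJOBID files are spooled in /tmp as well, so the estimate should leave some headroom.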

Example Job Script

#!/bin/bash
#PBS -l nodes=1:ppn=8,pmem=3GB,walltime=03:00:00
#PBS -N test_local
#PBS -l qos=test

#This script makes the following assumptions:
# - Any files that match $JOBSOURCE/$JOBNAME* should be copied to the local node's /tmp directory as input files.
#   If this isn't the case, modify the "cp" line just after the "Copying working data" output
#
# - The $JOBNAME.log file is put inside $TEMPORARY_DIR, and copied back at the end of the job script
#
# - If the job finishes successfully, only the $JOBNAME*.mo and $JOBNAME.log should be copied back to the original
# directory - If this isn't the case, look at the two "cp" lines just after the "Copying resulting data" output.
#
# - If the job is deleted using "qdel" or hits its walltime, all the data should be deleted. If this isn't the case,
# look at the NOTE in the "cleanup_scratch" signal handler

export TEMPORARY_DIR="/tmp/$PBS_JOBID"
export GAUSS_SCRDIR="$TEMPORARY_DIR/temporary_files"
export JOBNAME=test
export JOBSOURCE="$PBS_O_WORKDIR"

# set up function. this isn't called/run here. It's used if the job is canceled via a signal.
cleanup_scratch() {
    echo "Deleting inside signal handler, meaning I probably either hit the walltime, or deleted the job using qdel"

    ##NOTE: IF YOU WANT TO KEEP ANY OF THE FILES FROM $TEMPORARY_DIR WHEN THE JOB IS DELETED
    # BY qdel OR KILLED BECAUSE OF WALLTIME, USE A COMMAND LIKE THIS:
    #cp -v "$TEMPORARY_DIR/$JOBNAME"*.mo "$PBS_O_WORKDIR"

    cd "$PBS_O_WORKDIR"
    rm -rfv "$GAUSS_SCRDIR"
    rm -rfv "$TEMPORARY_DIR"
    echo "---"
    echo "Signal handler ending time:"
    date
    exit 0
}

# Associate the function "cleanup_scratch" with the TERM signal, which is usually how jobs get killed
trap 'cleanup_scratch' TERM

# basic diagnostic output
echo "---"
echo "Beginning-of-job Diagnostic information:"
echo "---"
echo "Nodes assigned:"
cat "$PBS_NODEFILE"
echo "---"
echo "Temporary Directory:"
echo "$TEMPORARY_DIR"
echo "---"
echo "Scratch Directory:"
echo "$GAUSS_SCRDIR"
echo "---"
echo "Job Source Directory:"
echo "$JOBSOURCE"
echo "---"
echo "Current Time:"
date
echo "---"

# create temporary directory
echo "Creating Temporary directory at $TEMPORARY_DIR"
mkdir -pv "$TEMPORARY_DIR" 2>&1
echo "---"

# create scratch directory
echo "Creating scratch directory at $GAUSS_SCRDIR"
mkdir -pv "$GAUSS_SCRDIR" 2>&1
echo "---"

# copy working data information from $JOBSOURCE/$JOBNAME* to $TEMPORARY_DIR
echo "Copying working data information from $JOBSOURCE/$JOBNAME* to $TEMPORARY_DIR"
cp -v "$JOBSOURCE/$JOBNAME"* "$TEMPORARY_DIR"
echo "---"

# changing directory to $TEMPORARY_DIR
echo "Changing directory to temporary dir at $TEMPORARY_DIR"
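# stop here if the directory is unavailable, so Gaussian never runs from shared storage
cd "$TEMPORARY_DIR" || { echo "Could not change to $TEMPORARY_DIR, aborting" >&2; exit 1; }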
echo "---"

echo "Starting Gaussian Run at:"
date

# the actual gaussian run starts here
/fslapps/chem/bin/rung03  "$JOBNAME.log"
/fslapps/chem/bin/rmipc

echo "---"
echo "Gaussian Run ended at:"
date
echo "---"

echo "Copying resulting data (\$JOBNAME*.mo and \$JOBNAME.log) to $JOBSOURCE"
cp -v "$TEMPORARY_DIR/$JOBNAME"*.mo "$JOBSOURCE"
cp -v "$TEMPORARY_DIR/$JOBNAME.log" "$JOBSOURCE"
echo "---"

echo "Changing directory back to submission directory at $PBS_O_WORKDIR"
cd "$PBS_O_WORKDIR"

# delete directory
echo "Deleting directories at end of script"
rm -rfv "$GAUSS_SCRDIR"
rm -rfv "$TEMPORARY_DIR"

echo "---"
echo "Job ending time:"
date
echo "---"
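
The example script removes its directories in two places: at the end of the script and in the TERM handler. One alternative, shown here only as a sketch and not as a replacement for the script above, is to run the work inside a subshell with a bash EXIT trap, so the directory is removed whenever that subshell ends, including after an early failure. The function name run_in_scratch and the fallback job ID are hypothetical. The sketch makes no claim about qdel or walltime kills; the TERM handler above covers those.

```bash
#!/bin/bash
# Hypothetical wrapper: create a scratch directory, run a command inside it,
# and remove the directory when the subshell exits. The function body is a
# subshell (parentheses instead of braces), so the EXIT trap and the "cd"
# do not affect the calling script.
run_in_scratch() (
    scratch_dir="$1"
    shift
    mkdir -p "$scratch_dir" || exit 1
    # Leave the directory before deleting it, then remove it and its contents.
    trap 'cd / && rm -rf "$scratch_dir"' EXIT
    cd "$scratch_dir" || exit 1
    "$@"
)

# Harmless demonstration; in a real job the command would copy inputs,
# run Gaussian, and copy results back before returning.
run_in_scratch "/tmp/${PBS_JOBID:-interactive_test.$$}" echo "working in a per-job scratch directory"
```

Because the directory is deleted as soon as the command returns, any results that need to be kept must be copied back to the submission directory as part of the command itself.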