BYU

Office of Research Computing

How do I check the output from my job while the job is running?

Run "checkjob -v $YOUR_JOB_ID" and find the line "Allocated Nodes:". The list on the line after it gives each node in square brackets: the hostname, then a colon, then the number of procs you were allocated on that node.

From a login node, ssh into that node and look in the directory /usr/spool/PBS/spool. Your job's standard output is in /usr/spool/PBS/spool/$JOBID.fslsched.fsl.byu.edu.OU and its standard error is in the matching file ending in .ER.
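The file naming described above can be captured in a small helper. This is only a sketch: the function name spool_paths is made up for illustration, and the directory and the .fslsched.fsl.byu.edu suffix are taken from this page and may differ on other systems.

```shell
# Hypothetical helper (not a site-provided command): print the spool files
# for a job ID, using the directory and server suffix described on this page.
spool_paths() {
    job_id=$1
    spool_dir=/usr/spool/PBS/spool
    # First line: standard output (.OU); second line: standard error (.ER)
    printf '%s/%s.fslsched.fsl.byu.edu.OU\n' "$spool_dir" "$job_id"
    printf '%s/%s.fslsched.fsl.byu.edu.ER\n' "$spool_dir" "$job_id"
}
```

For example, once you know the node, something like ssh m5-4-5 tail -f "$(spool_paths 12345 | head -n 1)" should follow the job's standard output as it grows.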

Example:

me@m6int01:~$ checkjob -v 12345
job 12345 (RM job '12345.fslsched.fsl.byu.edu')

AName: myjobname
State: Running
Creds:  user:userabcde  group:userabcde  account:user45678  class:batch  qos:normal
WallTime:   20:03:33 of 12:12:00:00
SubmitTime: Thu Jul  5 14:57:27
  (Time Queued  Total: 00:00:44  Eligible: 00:00:43)

StartTime: Thu Jul  5 14:58:11
Total Requested Tasks: 1

Req[0]  TaskCount: 1  Partition: base
Memory >= 8192M  Disk >= 0  Swap >= 0
Dedicated Resources Per Task: PROCS: 1  MEM: 8192M
Utilized Resources Per Task:  PROCS: 0.82  MEM: 22G  SWAP: 25G
Avg Util Resources Per Task:  PROCS: 0.82
Max Util Resources Per Task:  PROCS: 0.83  MEM: 22G  SWAP: 25G
Average Utilized Memory: 22726.15 MB
Average Utilized Procs: 0.82
TasksPerNode: 1  NodeCount:  1

Allocated Nodes:
[m5-4-5:1]


Notification Events: JobStart,JobEnd,JobFail  Notification Address: email@example.com
Task Distribution: m5-4-5

Shell:          /bin/bash
IWD:            /some/path
UMask:          0000
OutputFile:     m6int02.fsl.byu.edu:/some/path.o12345
ErrorFile:      m6int02.fsl.byu.edu:/some/path.e12345
StartCount:     1
Partition List: [ALL]
SrcRM:          base  DstRM: base  DstRMJID: 12345.fslsched.fsl.byu.edu
Submit Args:    file.sh
Flags:          BACKFILL,PREEMPTOR,FSVIOLATION
Attr:           BACKFILL,FSVIOLATION,checkpoint
StartPriority:  19873
PE:             2.78
Reservation '12345' (-20:03:51 -> 11:15:56:09  Duration: 12:12:00:00)


me@m6int01:~$ ssh m5-4-5
me@m5-4-5:~$ cd /usr/spool/PBS/spool
me@m5-4-5:/usr/spool/PBS/spool$ ls
12345.fslsched.fsl.byu.edu.ER  12345.fslsched.fsl.byu.edu.OU
me@m5-4-5:/usr/spool/PBS/spool$ tail 12345.fslsched.fsl.byu.edu.OU
some of the last lines of your really important job's output are now shown
me@m5-4-5:/usr/spool/PBS/spool$

In this case the hostname is m5-4-5.
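If you do this often, the hostname lookup can be scripted. The sketch below assumes the checkjob -v output looks like the example on this page: an "Allocated Nodes:" line followed by one or more [hostname:procs] entries and then a blank line. The function name allocated_nodes is hypothetical, and the parsing may need adjusting if your output is laid out differently.

```shell
# Hypothetical helper: read `checkjob -v` output on stdin and print one
# allocated hostname per line, assuming the layout shown in the example.
allocated_nodes() {
    awk '
        /^Allocated Nodes:/ { in_list = 1; next }   # the node list starts after this line
        in_list && NF == 0  { if (seen) exit; next } # a blank line after the list ends it
        in_list {
            seen = 1
            gsub(/\[/, " "); gsub(/\]/, " ")        # turn [host:n][host:n] into fields
            for (i = 1; i <= NF; i++) {
                host = $i
                sub(/:[0-9]+$/, "", host)           # drop the ":procs" count
                print host
            }
        }
    '
}
```

Usage would be along the lines of: checkjob -v 12345 | allocated_nodes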