Why didn't the scheduler give me the resources I asked for?

On occasion, the scheduler software may adjust the exact layout of a job in an attempt to make better use of resources. This should affect only the specific node/processor layout. If you experience any other issues with job resources not being what you expect, please let us know.

Overview

Sometimes, we see users who request resources like this:

#PBS -l nodes=48:ppn=1

This means that the job is requesting 1 processor on each of 48 nodes. There's nothing wrong with this in theory, but a large number of our users request whole nodes, and their jobs cannot share any of those 48 nodes with this job.

As a result, we've allowed the scheduler to rewrite the request slightly. The total number of processors will be the same, but in the example above, it might decide to give you 4 nodes with 12 processors per node, or 8 nodes with 6 processors each, or even 16 nodes with 3 processors each. In general, it will try to pack you into the fewest nodes possible.
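To make the arithmetic concrete, here is a minimal sketch that lists the equal-sized layouts that keep the total processor count of the example above. The function name list_layouts and the limit of 12 processors per node are assumptions chosen for illustration; the scheduler's real decision also depends on which nodes are free, so this shows only the possibilities, not what you will actually be given.

```shell
#!/bin/bash
# list_layouts TOTAL MAX_PPN   (hypothetical helper, not part of the scheduler)
# Print every nodes/ppn pair whose product is TOTAL and whose ppn does not
# exceed MAX_PPN, starting with the layout that uses the fewest nodes.
list_layouts() {
    local total=$1 max_ppn=$2 ppn
    for (( ppn = max_ppn; ppn >= 1; ppn-- )); do
        # Only keep ppn values that divide the total evenly.
        if (( total % ppn == 0 )); then
            echo "nodes=$(( total / ppn )):ppn=${ppn}"
        fi
    done
}

# The request from the example: 48 processors, assuming 12 per node at most.
list_layouts 48 12
```

The first line printed is the most tightly packed layout, and the last line is the original one-processor-per-node request.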

This is usually an advantage, or at least not a problem. If your program does a lot of communicating between processors, you may see a performance improvement, since communication between processors on the same node is faster than communication between nodes, even when those nodes have a special interconnect like InfiniBand.

If your program doesn't depend on communication very much, then it will still run about the same as it did before.
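If you want to see which layout a job actually received, one option is to summarize the node file from inside the job. The sketch below assumes the usual PBS behavior, where $PBS_NODEFILE names a file containing one hostname line per assigned processor; the function name show_layout is a hypothetical helper.

```shell
#!/bin/bash
# show_layout NODEFILE   (hypothetical helper)
# Count how many processors the job was given on each node, assuming the
# file holds one hostname per assigned processor.
show_layout() {
    sort "$1" | uniq -c | awk '{ printf "%s: %d processor(s)\n", $2, $1 }'
}

# PBS_NODEFILE is only set inside a running job, so do nothing elsewhere.
if [ -n "${PBS_NODEFILE:-}" ]; then
    show_layout "$PBS_NODEFILE"
fi
```

Placed near the top of a job script, this prints one line per node, which makes it easy to compare the layout you requested with the one the scheduler chose.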

But I really want that exact layout. How do I do that?

In a few rare instances, users explicitly want to test the communication between nodes, and so want to force the job to use the exact layout of nodes and ppn that they specified. Because this is so rare, we have made packing, as described above, the default behavior. However, if you really want the exact layout you specified, you can add this line to your job script:

#PBS -W x=nmatchpolicy:exactnode
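For context, this is roughly how that line might sit in a complete job script, next to the resource request from the earlier example. It is only a sketch: the program name ./my_program is a hypothetical placeholder, and the echo stands in for your real workload.

```shell
#!/bin/bash
# Request 1 processor on each of 48 nodes (the example request used earlier).
#PBS -l nodes=48:ppn=1
# Ask the scheduler to keep that layout exactly instead of packing the job.
#PBS -W x=nmatchpolicy:exactnode

# Run from the directory the job was submitted from; fall back to the
# current directory when the script is run outside of a job.
cd "${PBS_O_WORKDIR:-.}"

# Placeholder for the real workload; ./my_program is a hypothetical name.
echo "job would start ./my_program here"
```

All #PBS lines need to come before the first command in the script, since directives after that point are not read.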

Alternatively, you can simply add it as a parameter to your job submission, so instead of this:

qsub myjobscript

You'd have something more like this:

qsub -W x=nmatchpolicy:exactnode myjobscript