Matlab

As of the existing installations of Matlab in /fslapps will no longer be able to check out licenses.

For information on how to use Matlab, please see the official Matlab Documentation or talk to others at the university who have used Matlab.

Access to the Matlab software on FSL systems requires acknowledging that you understand and agree to the licensing terms. Please read the Licensing section.

Submitting a simple Matlab job on FSL systems

The following is a sample command to be placed in a job script and is intended for simple, single processor Matlab jobs. If you wish to use more processors, please see the next section, Matlab Parallel Computing Toolbox and Distributed Computing Server.

Place the following command in a job script:

module load matlab
matlab -nodisplay -nojvm -nosplash -r $YourMatLabFilename

where $YourMatLabFilename is the name of a Matlab file in the current directory, excluding the .m extension. In this case, for example, Matlab would look for the file YourMatLabFilename.m in the current directory. The options that precede the filename are important, as they allow Matlab to run in batch mode. Once the job script is ready, submit it with the sbatch command. For more information about creating a job script, see our SLURM Tools introduction video and our video on how to use the job script generator.
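As a rough illustration of the pieces described above, the following shell sketch prints a minimal job script for a single-processor Matlab run. The #SBATCH resource values, the helper names matlab_name and make_job_script, and the file name analysis.m are placeholders chosen for this example, not FSL requirements; adjust them to your own job.

```shell
#!/bin/bash
# Sketch only: helper names, resource values, and "analysis.m" are
# hypothetical placeholders, not part of the FSL setup.

# Strip a trailing .m so the name can be passed to "matlab -r".
matlab_name() {
    basename "$1" .m
}

# Print a single-processor job script for the given Matlab file.
make_job_script() {
    cat <<EOF
#!/bin/bash
#SBATCH --time=00:30:00
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=2G

module load matlab
matlab -nodisplay -nojvm -nosplash -r $(matlab_name "$1")
EOF
}

# Show the generated job script on standard output.
make_job_script analysis.m
```

Redirecting the output to a file (for example, make_job_script analysis.m > matlab_job.sh) gives a script that could then be submitted with sbatch matlab_job.sh.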

Matlab Parallel Computing Toolbox and Distributed Computing Server

In addition to Matlab itself, we also offer the Matlab Parallel Computing Toolbox and the Matlab Distributed Computing Server. The former is used to create individual Matlab programs that run in parallel across multiple processors, or to run multiple Matlab programs simultaneously. By default, such programs may only run on the processors of a single compute node. The latter lifts that restriction: with the Distributed Computing Server, programs created with the Parallel Computing Toolbox can run on multiple compute nodes simultaneously.

How to use Matlab Parallel Computing Toolbox and Distributed Computing Server

There are 2 types of jobs that can be run in the Matlab parallel environment:

Distributed jobs
Consist of multiple tasks that are executed simultaneously, where each task is a separate program. There is no communication between the tasks and they are not dependent upon each other. This is also known as "embarrassingly parallel" computation.
Parallel jobs
Consist of a single program that is broken up into multiple dependent tasks that communicate with one another.

In general, distributed jobs require much less effort than parallel jobs, because parallel jobs require you to rework your Matlab code. For information on how to program distributed jobs, see this page in the Matlab documentation. For parallel jobs, go here. As you go through the documentation, please ignore any information about how to use the scheduler, as we have our own FSL-specific setup, described in the steps below. The scheduler is only required for computation on multiple compute nodes. If you wish to use only the Parallel Computing Toolbox on a single compute node, please see this page and pay particular attention to how the matlabpool function is used. This will help you get the results you want on a single compute node.

If you intend to use the scheduler and the Distributed Computing Server, there are a few basic steps to follow:

  1. There are 2 optional variables that you can set that correspond to job parameters:
    walltime
    This tells PBS how long you expect the job to run. If it is not set, 1 hour is assumed.
    ppn
    The number of processors per node. If this is not set, 2 processors per node is assumed. Note that ppn is irrelevant in the context of distributed jobs.
  2. Call the FSLfindResource script by inserting the line FSLfindResource into your code. When calling any other functions on the scheduler object afterwards, remember that its identifier is sched.
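To get a feel for how ppn interacts with the number of workers in a parallel job, the following shell sketch estimates how many compute nodes a job would span. It assumes workers are packed ppn per node, which is an assumption made for illustration; the scheduler's actual placement may differ. The function name nodes_needed is hypothetical.

```shell
#!/bin/bash
# Sketch only: assumes workers are packed "ppn" per node.

# nodes_needed WORKERS PPN
# Ceiling division: nodes = ceil(WORKERS / PPN).
nodes_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

nodes_needed 4 3   # 4 workers with ppn = 3
nodes_needed 5 2   # 5 workers with the default ppn of 2
```

Under this assumption, 4 workers with ppn = 3 would need 2 nodes, and 5 workers at the default ppn of 2 would need 3.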

For more information about parallel programming in Parallel Computing Toolbox and using it with Distributed Computing Server, see the main page of the Parallel Computing Toolbox Documentation.

Examples

The following example programs are split into 2 parts. The first is a submission program that submits your job and exits without waiting for it to finish, so you can continue to work in Matlab while the job runs. Once you know the job is finished (if you specified an email address, you will receive an email when it completes), you can run the second part of the program, which retrieves the output from the job.

Here is an example of a distributed program that runs on multiple compute nodes:

Actual Matlab program (myRand.m)

function out = myRand(in1,in2)
%% myRand is a wrapper around rand.  
% myRand is simply being used so we have custom code to send to the cluster
% that is needed to finish the tasks.

out = rand(in1,in2);

Submission

clc
clear all

% Information specific to your job
% PUT YOUR OWN EMAIL ADDRESS HERE
email = 'youremail@example.com'
walltime = '00:02:00'

% Set up Matlab to interface with the BYU supercomputer's scheduler
FSLfindResource

% Create the job
myjob=createJob(sched)
myjob_id=myjob.ID

% Save the job for later retrieval
save myjob myjob_id

% Tell Matlab which files are needed for the computation
set(myjob, 'FileDependencies', {'myRand.m'})

% Create the task
t=createTask(myjob, @myRand, 1, {{3,3} {3,3} {3,3} {3,3} {3,3}});

% Submit the job
submit(myjob)
disp('done submitting ...')

Retrieval

clc
clear

% load job ID and find the job
sched = findResource('scheduler','type','generic')
load myjob
myjob = findJob(sched, 'ID', myjob_id)

% get the output
results=getAllOutputArguments(myjob)
results{1:5}

Here is an example program that runs in parallel mode on multiple compute nodes:

Actual Matlab parallel program (colsum.m)

function total_sum = colsum
if labindex == 1
    % Send magic square to other labs
    A = labBroadcast(1,magic(numlabs))
else
    % Receive broadcast on other labs
    A = labBroadcast(1)
end

% Calculate sum of column identified by labindex for this lab
column_sum = sum(A(:,labindex))

% Calculate total sum by combining column sum from all labs
total_sum = gplus(column_sum)

Submission

clc
clear all

% Information specific to your job
% PUT YOUR OWN EMAIL ADDRESS HERE
email = 'youremail@example.com'
walltime = '00:02:00'
ppn = 3

% Set up Matlab to interface with the BYU supercomputer's scheduler
FSLfindResource

% Create the job
myjob=createParallelJob(sched)
myjob_id=myjob.ID

% Save the job for later retrieval
save myjob myjob_id

% Tell Matlab which files are needed for the computation
set(myjob, 'FileDependencies', {'colsum.m'})

% Set maximum # of workers
set(myjob, 'MaximumNumberOfWorkers',4)

% Set minimum # of workers
set(myjob, 'MinimumNumberOfWorkers',4)

% Create the task
t=createTask(myjob, @colsum, 1, {})

% Submit the job
submit(myjob)
disp('done submitting ...')

Retrieval

clc
clear

% load job ID and find the job
sched = findResource('scheduler','type','generic')
load myjob
myjob = findJob(sched, 'ID', myjob_id)

% get the output
results=getAllOutputArguments(myjob);
results{1:4}

Licensing

Matlab, like most commercial programs, is subject to licensing. You are on your honor to follow the licensing terms and policies between BYU and The MathWorks™. In summary, the allowed instances per user are:

  • For standard Matlab without the Parallel Computing Toolbox or Distributed Computing Server, each individual user may run up to 2 instances of Matlab at any given time, e.g. one instance on the supercomputer and one instance on your desktop, or 2 instances on the supercomputer.
  • When using Parallel Computing Toolbox, each user may run up to 2 instances of Matlab with up to 8 tasks per instance. If you wish to use more than this, you must use the Distributed Computing Server.
  • When using the Distributed Computing Server (DCS), users may run multiple instances of Matlab, each using multiple tasks, provided the task licenses are available from the licensing server at the time of execution. There are 96 of these licenses available in total, and each parallel or distributed task requires one license. These 96 licenses are shared by all users.
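The arithmetic behind these limits can be sketched in shell. The function names below are hypothetical, and the check only compares a planned run against the numbers stated above; it does not query the license server, so the number of DCS licenses actually free at run time may be lower than 96.

```shell
#!/bin/bash
# Sketch only: limits copied from the licensing summary above.
DCS_LICENSES=96   # total DCS task licenses, shared by all users

# licenses_needed INSTANCES TASKS_PER_INSTANCE
# Each parallel or distributed task requires one license.
licenses_needed() {
    echo $(( $1 * $2 ))
}

# fits_toolbox_limit INSTANCES TASKS_PER_INSTANCE
# Succeeds when the request stays within 2 instances of 8 tasks each.
fits_toolbox_limit() {
    [ "$1" -le 2 ] && [ "$2" -le 8 ]
}

if fits_toolbox_limit 2 8; then
    echo "2x8: Parallel Computing Toolbox is enough"
fi
if ! fits_toolbox_limit 3 8; then
    echo "3x8: needs DCS, $(licenses_needed 3 8) of $DCS_LICENSES task licenses"
fi
```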

To help ensure users are aware of these licensing terms, FSL requires an acknowledgement and acceptance of them in order to access the Matlab software on FSL resources.

Activate Matlab on my FSL user account.