Environment Modules

This page refers to environment modules on our new operating system; information on the old TCL environment modules can be found here.

The Module System

The Lua environment module system (or Lmod) allows one to easily "load" and "unload" specific pieces of software. For example, to gain the use of Intel's compilers, which aren't included with our operating system by default, I can simply execute module load intel-compilers:

$ module list
No modules loaded
$ icc --version
-bash: icc: command not found
$ module load intel-compilers
$ module list

Currently Loaded Modules:
  1) intel-compilers/2017

$ icc --version
icc (ICC) 17.0.4 20170411
Copyright (C) 1985-2017 Intel Corporation.  All rights reserved.

Changes made using modules take effect in your current shell immediately (you do not have to log out and back in); changes are not persistent across login shells, unless you save them as described in "Advanced Usage" below.
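
To see why changes are immediate but not persistent, it helps to remember that loading a module mostly amounts to editing environment variables (such as PATH) in the current shell. The following is a minimal sketch that imitates this by hand; it does not call Lmod and does not show how Lmod works internally. The demo_cc script and the temporary directory are made-up stand-ins for a real compiler install:

```shell
#!/bin/bash
# Hypothetical stand-in for a compiler install; not a real path on our systems.
demo_prefix="$(mktemp -d)"
mkdir -p "$demo_prefix/bin"
printf '#!/bin/sh\necho "stand-in compiler"\n' > "$demo_prefix/bin/demo_cc"
chmod +x "$demo_prefix/bin/demo_cc"

old_path="$PATH"

# Roughly what loading a module does: prepend its bin directory to PATH.
export PATH="$demo_prefix/bin:$PATH"
loaded_output="$(demo_cc)"      # found through the modified PATH
echo "after load:   $loaded_output"

# Roughly what unloading does: put PATH back the way it was.
export PATH="$old_path"
hash -r                         # drop any cached command locations
if command -v demo_cc >/dev/null 2>&1; then
    after_unload="still found"
else
    after_unload="not found"
fi
echo "after unload: $after_unload"

rm -rf "$demo_prefix"
```

Because only the variables of the running shell are changed, a new login shell starts from the defaults again.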

Syntax and Usage

Most module commands are of the format module COMMAND [OPTION]. Here are some of the commonly used commands:

Command                     Functionality
module list                 List loaded modules
module avail [pattern]      List available modules [containing the specified pattern]
module spider [modulename]  List all modules, or show how to load the specified module
module load modulename      Load the specified module
module unload modulename    Unload the specified module
module purge                Unload all currently loaded modules
module whatis modulename    Get a brief summary about the module, if one is provided

A more complete list is available here.

Module Conflicts and Dependencies

Conflicts

Loading two versions of the same software simultaneously could cause unpredictable issues, and in some cases, different programs should not interact. Lmod allows modules to conflict, which means that they cannot be loaded together. When the conflict is between differing versions of the same software, Lmod will take care of the switch for you; when different programs conflict, manual resolution is needed:

$ module load gcc/6
$ module load gcc/7

The following have been reloaded with a version change:
  1) gcc/6 => gcc/7

$ module load conflicts_with_gcc
Lmod has detected the following error:  Cannot load module "conflicts_with_gcc" because these module(s) are loaded:
   gcc

While processing the following module(s):
    Module fullname     Module Filename
    ---------------     ---------------
    conflicts_with_gcc  /apps/modulefiles/Linux/conflicts_with_gcc.lua

$ module unload gcc/7
$ module load conflicts_with_gcc
$ module list

Currently Loaded Modules:
  1) conflicts_with_gcc

Hierarchy

Sometimes, it doesn't make sense to load a module if its "parent" module isn't loaded; for example, cuDNN would be of no use without CUDA. One of Lmod's tools for dealing with this is module hierarchy--only making a module available once its "parent" module has been loaded. cuDNN is not found by module avail until CUDA has been loaded:

$ module list
No modules loaded
$ module avail cudnn
No modules found!
Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

$ module load cudnn  # I know it's there, somewhere!
Lmod has detected the following error:  These module(s) exist but cannot be loaded as requested: "cudnn"
   Try: "module spider cudnn" to see how to load the module(s).

$ module load cuda
$ module load cudnn
$ module list

Currently Loaded Modules:
  1) cuda/9.2   2) cudnn/7.1

Note that when we tried to load cuDNN before loading CUDA, we were told that the module exists and prompted to try module spider cudnn. This indicates that the cudnn module is there, but isn't yet available. Attempting to load a module that does not exist results in a different message:

$ module load not_a_module
Lmod has detected the following error:  The following module(s) are unknown: "not_a_module"

Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
  $ module --ignore-cache load "not_a_module"

Also make sure that all modulefiles written in TCL start with the string #%Module

When we do as prompted and use module spider, we are given helpful instructions on how to load the module:

$ module spider cudnn

------------------------------------------------------------------------------------------------------------------------
  cudnn:
------------------------------------------------------------------------------------------------------------------------
     Versions:
        cudnn/6.0
        cudnn/7.0
        cudnn/7.1

------------------------------------------------------------------------------------------------------------------------
  For detailed information about a specific "cudnn" module (including how to load the modules) use the module's full name.
  For example:

     $ module spider cudnn/7.1
------------------------------------------------------------------------------------------------------------------------

$ module spider cudnn/7.1

------------------------------------------------------------------------------------------------------------------------
  cudnn: cudnn/7.1
------------------------------------------------------------------------------------------------------------------------

    You will need to load all module(s) on any one of the lines below before the "cudnn/7.1" module is available to load.

      cuda/9.2

Using module hierarchy allows us to avoid ungainly naming schemes--for example, were there to be two OpenMPI 3.1 modules, one built with GCC 6 and one built with GCC 7, we would need to name them something like openmpi/3.1_gcc-6 and openmpi/3.1_gcc-7 absent module hierarchy. With module hierarchy, only one of these modules will ever be available at once, so they can both be named openmpi/3.1. This has other advantages, as discussed below.
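
The idea behind hierarchy can be imitated with plain directories: loading a parent module adds a hidden directory to the module search path, and only modulefiles under directories on that path are visible. The sketch below is a toy model under that assumption. It builds an empty placeholder tree in a temporary directory, modeled loosely on the paths shown on this page, and the list_visible function is invented for this example:

```shell
#!/bin/bash
# Toy layout; the .lua files are empty placeholders, not real modulefiles.
root="$(mktemp -d)"
mkdir -p "$root/Linux/gcc" "$root/.gcc-6/openmpi" "$root/.gcc-7/openmpi"
touch "$root/Linux/gcc/6.lua" "$root/Linux/gcc/7.lua"
touch "$root/.gcc-6/openmpi/3.1.lua" "$root/.gcc-7/openmpi/3.1.lua"

# list_visible PATHLIST: print name/version for every .lua file found under
# the colon-separated directories, similar in spirit to "module avail".
list_visible() {
    local dir
    local IFS=:
    for dir in $1; do
        (cd "$dir" && find . -name '*.lua' | sed -e 's|^\./||' -e 's|\.lua$||')
    done | sort
}

search_path="$root/Linux"
before="$(list_visible "$search_path")"

# "Loading gcc/7" in this model just means prepending its hidden directory.
search_path="$root/.gcc-7:$search_path"
after="$(list_visible "$search_path")"

echo "before loading gcc/7:"; echo "$before"
echo "after loading gcc/7:";  echo "$after"

rm -rf "$root"
```

Both openmpi/3.1 files exist the whole time, but only the one under the directory for the loaded compiler is ever on the search path, which is why the two can share a name.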

Prerequisites

Software occasionally depends on multiple independent chains of other software. For example, mpi4py requires both OpenMPI and Python in order to be of any use. In such cases, we set up module hierarchy with one of the software chains, and put a prerequisite on the other chain(s). mpi4py becomes available as soon as Python is loaded, independent of whether OpenMPI is loaded:

$ module list
No modules loaded
$ module avail mpi4py
No modules found!
Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

$ module load python/3.6
$ module avail mpi4py

-------------------------------------------- /apps/modulefiles/.python-3.6 --------------------------------------------
   mpi4py/3.0

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

However, when we try to load mpi4py, it tells us that OpenMPI is needed; we can then load OpenMPI, which will allow mpi4py to be loaded:

$ module load mpi4py/3.0
Lmod has detected the following error:  Cannot load module "mpi4py/3.0" without these module(s) loaded:
   openmpi/3.1

While processing the following module(s):
    Module fullname  Module Filename
    ---------------  ---------------
    mpi4py/3.0       /apps/modulefiles/.python-3.6/mpi4py/3.0.lua

$ module load gcc/7 openmpi/3.1  # openmpi/3.1 requires gcc/7
$ module load mpi4py/3.0
$ module list

Currently Loaded Modules:
  1) python/3.6   2) gcc/7   3) openmpi/3.1   4) mpi4py/3.0

Swap Modules and Toolchains

Module hierarchy allows Lmod to swap modules intelligently, updating toolchains as it does. For example, cuDNN 7.0 depends on CUDA 9.0, while cuDNN 7.1 depends on CUDA 9.2. When we load up a CUDA/cuDNN pair, then swap CUDA (the base of this toolchain), the cuDNN module is updated as well; likewise for GCC/OpenMPI pairs:

$ module load gcc/6 openmpi
$ module load cuda/9.0 cudnn
$ module swap cuda/9.2

The following have been reloaded with a version change:
  1) cuda/9.0 => cuda/9.2     2) cudnn/7.0 => cudnn/7.1

$ module swap gcc/7

The following have been reloaded with a version change:
  1) gcc/6 => gcc/7     2) openmpi/3.0 => openmpi/3.1

Advanced Usage

Load Modules in a Job Script

A job script's environment is inherited from the shell that submits the job, so any modules loaded when the job was submitted will also be loaded in that job. This can be convenient in a few cases, but most of the time a job script needs a fixed set of modules. To ensure that the same set of modules is loaded every time a job script runs, one can simply purge, then load the needed modules within that script:

#!/bin/bash

#SBATCH --time=00:30:00
#SBATCH --mem=1G
#SBATCH --ntasks=1

module purge  # anything inherited from the submitting shell is purged
module load gcc openmpi python  # these will be the only loaded modules

echo "Loaded modules:"
module list
echo "Ha! Your shell's loaded modules do nothing here!"

Note that these changes will not be reflected in the shell from which the job was submitted.
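
If a job should stop early when an expected module is missing, a small check can be added after the module load line. The sketch below assumes that Lmod exports the loaded modules as a colon-separated list in the LOADEDMODULES environment variable; please verify that on your system before relying on it. The require_loaded function is a hypothetical helper, not part of Lmod, and LOADEDMODULES is set by hand here only so that the example runs anywhere:

```shell
#!/bin/bash
# require_loaded NAME: succeed if a module called NAME (any version) appears
# in LOADEDMODULES. Hypothetical helper, not an Lmod command.
require_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1:"* | *":$1/"*) return 0 ;;
        *) echo "required module not loaded: $1" >&2; return 1 ;;
    esac
}

# Stand-in value for demonstration only; in a real job script this variable
# would already be set after "module load gcc openmpi python".
LOADEDMODULES="gcc/7:openmpi/3.1:python/3.6"

if require_loaded gcc; then gcc_ok=yes; else gcc_ok=no; fi
if require_loaded cuda 2>/dev/null; then cuda_ok=yes; else cuda_ok=no; fi
echo "gcc: $gcc_ok, cuda: $cuda_ok"
```

In an actual job script one would drop the stand-in assignment and write something like require_loaded gcc || exit 1 after the module load line.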

Make Persistent Changes

Non-default modules must be loaded every time one logs in; for users who almost always use the same set of (non-default) modules, this can become tedious. Lmod provides module save, which allows one to save the currently loaded set of modules as the default set:

$ ssh myname@rhel7ssh.fsl.byu.edu
Last login: Thu Aug 23 09:34:34 2018 from 10.32.37.33
Fulton Supercomputing Lab

Support: https://marylou.byu.edu/ticket/ or fslsupport@byu.edu

By using this system, you agree to abide by the Supercomputing usage
policy. See https://marylou.byu.edu/documentation/principles for details.

$ module list

Currently Loaded Modules:
  1) lmod/7.5.13   2) defaultenv/10 (H)

  Where:
   H:  Hidden Module

$ module purge
$ module load gcc openmpi python
$ module save
Saved current collection of modules to: "default"

$ exit
logout
Connection to rhel7ssh.fsl.byu.edu closed.
$ ssh myname@rhel7ssh.fsl.byu.edu
Last login: Thu Aug 23 09:34:34 2018 from 10.32.37.33
Fulton Supercomputing Lab

Support: https://marylou.byu.edu/ticket/ or fslsupport@byu.edu

By using this system, you agree to abide by the Supercomputing usage
policy. See https://marylou.byu.edu/documentation/principles for details.

$ module list

Currently Loaded Modules:
  1) gcc/7   2) openmpi/3.1   3) python/3.6

If you have multiple sets of modules that you would like to have quick access to, you can create collections with module save collection_name, and later access those collections with module restore collection_name:

$ module purge
$ module load cuda cudnn
$ module save nvidia_collection
Saved current collection of modules to: "nvidia_collection"

$ module purge
$ module list
No modules loaded
$ module restore nvidia_collection
Restoring modules from user's nvidia_collection
$ module list

Currently Loaded Modules:
  1) cuda/9.2   2) cudnn/7.1

Using module restore without specifying a collection name will load the default set of modules.

A few more useful commands associated with collections:

Command                       Functionality
module describe [collection]  Print the contents of collection, or default if none is provided
module savelist               List saved collections
module disable [collection]   Disable collection, or default if none is provided

Collections are saved in ~/.lmod.d. Note that if you accidentally disable a collection, it is not removed--a '~' is simply appended to the name. To restore a collection that you've disabled, for example "my_collection":

$ module disable my_collection  # whoops!
Disabling my_collection collection by renaming with a "~"
$ module savelist
Named collection list :
  1) default
$ cd ~/.lmod.d
$ mv my_collection~ my_collection
$ module savelist
Named collection list :
  1) default  2) my_collection
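
The manual steps above can be wrapped in a small helper. This is only a sketch: undisable_collection is a made-up name, and it assumes that collections are plain files in ~/.lmod.d and that a disabled collection differs only by the trailing '~', as described above. The demonstration runs against a scratch directory rather than your real collections:

```shell
#!/bin/bash
# undisable_collection NAME [DIR]: rename "NAME~" back to "NAME".
# Hypothetical helper; DIR defaults to ~/.lmod.d.
undisable_collection() {
    local name="$1"
    local dir="${2:-$HOME/.lmod.d}"
    if [ ! -e "$dir/$name~" ]; then
        echo "no disabled collection named $name in $dir" >&2
        return 1
    fi
    if [ -e "$dir/$name" ]; then
        echo "refusing to overwrite existing collection $name" >&2
        return 1
    fi
    mv "$dir/$name~" "$dir/$name"
}

# Demonstration in a scratch directory with empty placeholder files.
scratch="$(mktemp -d)"
touch "$scratch/default" "$scratch/my_collection~"
undisable_collection my_collection "$scratch"
restored="$(ls "$scratch" | tr '\n' ' ')"
echo "collections now present: $restored"
rm -rf "$scratch"
```

The second check keeps the helper from silently replacing a collection that was saved again under the same name after the original was disabled.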

Make Your Own Modulefiles

Most modulefiles simply prepend or append to a few environment variables, and Lua's syntax is straightforward. As an example, here is the gcc/7 modulefile (/apps/modulefiles/Linux/gcc/7.lua), with some comments added:

whatis("GCC Compiler Collection") -- `module whatis gcc/7` prints this message

family("gcc") -- "conflicts" with other gcc modules, but the swap is automatic

local prefix = "/apps/gcc/7.3.0"

prepend_path("PATH", prefix .. "/bin") -- make gcc/g++/etc binaries findable
prepend_path("MANPATH", prefix .. "/share/man") -- make the man pages findable
prepend_path("LD_RUN_PATH", prefix .. "/lib64") -- ...long story

fsl.prepend_modulepath(".gcc-7") -- this is for module hierarchy

Most users simply want binaries that they installed to be found by the shell automatically; if you want more than that, you'll have to read the docs or contact us--a full tutorial on creating modulefiles is beyond the scope of this article.

Suppose I've just installed the package foo in ~/apps/foo, and I want to be able to use the included binary, bar (along with bas and bat), without having to type ~/apps/foo/bin/bar every time I use it. If I prepend ~/apps/foo/bin to the PATH environment variable, the shell will search there each time I run a command: when I simply type bar, the shell will look in ~/apps/foo/bin, find bar, and execute it; likewise with the other binaries in ~/apps/foo/bin, bas and bat. To that end, this modulefile would be perfectly sufficient:

whatis("My first modulefile! Gives access to foo's bar, bas, and bat binaries")
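local prefix = os.getenv("HOME") .. "/apps/foo" -- build the path from $HOME; a literal "~" in PATH is not expanded by every program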
prepend_path("PATH", prefix .. "/bin")

Once the foo module is loaded, you can simply type bar, bas, or bat, and they will execute automagically. To enable the loading of this module, we need to help Lmod find it. If I save the modulefile above as ~/.modulefiles/foo.lua, I need to add ~/.modulefiles to the environment variable MODULEPATH. This can be done directly, but the easiest way involves module use:

$ ls $HOME/.modulefiles
foo.lua
$ module purge
$ module load foo
Lmod has detected the following error:  The following module(s) are unknown: "foo"

Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
  $ module --ignore-cache load "foo"

Also make sure that all modulefiles written in TCL start with the string #%Module

$ module use $HOME/.modulefiles
$ module load foo
$ module list

Currently Loaded Modules:
  1) foo
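
For comparison, the direct approach mentioned above amounts to prepending the directory to MODULEPATH yourself. The following is a minimal sketch of that; prepend_modulepath is a name invented for this example, and module use remains the simpler route:

```shell
#!/bin/bash
# Prepend a directory to MODULEPATH by hand, skipping it if already listed.
# Hypothetical helper for illustration only.
prepend_modulepath() {
    local dir="$1"
    case ":${MODULEPATH:-}:" in
        *":$dir:"*) ;;   # already present, nothing to do
        *) export MODULEPATH="$dir${MODULEPATH:+:$MODULEPATH}" ;;
    esac
}

prepend_modulepath "$HOME/.modulefiles"
prepend_modulepath "$HOME/.modulefiles"   # second call changes nothing
echo "$MODULEPATH"
```

The membership check avoids piling up duplicate entries when the snippet is run more than once in the same shell.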

If you want to make persistent changes, module save can be used in combination with module use; simply use the path(s) to your modulefiles, load up the modules you want to be default, and save:

$ module purge
$ echo $MODULEPATH
/apps/modulefiles/Linux:/apps/modulefiles/Core:/apps/lmod/lmod/modulefiles/Core
$ module use $HOME/.modulefiles
$ echo $MODULEPATH
/fslhome/myuser/.modulefiles:/apps/modulefiles/Linux:/apps/modulefiles/Core:/apps/lmod/lmod/modulefiles/Core
$ module load gcc/7 python/3.6
$ module save  # current modules and MODULEPATH saved
Saved current collection of modules to: "default"

The path to your modulefiles will now be prepended to MODULEPATH automatically on each login.