Software:Frontenac

From CAC Wiki
Revision as of 16:27, 26 March 2018 by Jstaff (Talk | contribs)

The Frontenac cluster includes a wide variety of software and compilers. There are several new ways of accessing and using software, which are documented here. For a list of software available on the SW cluster, please see our SW software page.

The "module" system

Frontenac loads software with the Lmod module system, which replaces the "use" system of the SW cluster. We switched to Lmod for consistency with other Compute Canada systems and to provide a better user interface to our software. Lmod relies on largely the same concepts as the "use" system on the SW cluster.

On a large compute cluster, it is impossible to have all sets of software loaded all the time by default. Some software has multiple versions, some packages conflict with each other, and some pieces of software need to be configured separately for different use cases. Environment modules are designed to solve this problem, by treating each software package and all of its associated files as a distinct package to be loaded on demand. Modules also handle the loading of dependencies. For instance, loading the R programming language would be done by loading the "r" module - any dependencies would be handled behind the scenes by the module system without any user intervention.



How to use the module system

What you want to do                                | Lmod command (Frontenac cluster) | "Use" command (SW cluster)
See all available software                         | module avail                     | use -l
See a short description of what each package does  | module spider                    | <no equivalent>
Load the software package "packageName"            | module load packageName          | use packageName
Use a specific version of a software package       | module load packageName/version  | use packageName-version
View currently loaded packages                     | module list                      | <no equivalent>
Unload a package                                   | module unload packageName        | <no equivalent>
Unload all packages                                | module purge                     | use none

Please note that all commands are case-sensitive. For comprehensive documentation on the module system (including how to write your own modules), refer to the official Lmod documentation: http://lmod.readthedocs.io/en/latest/
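
For users migrating old scripts, the mapping in the table above can be sketched as a small shell helper. This is only an illustration: use_to_module is a hypothetical name, not a tool installed on either cluster, and the rule "the text after the last hyphen is a version if it starts with a digit" is an assumption that will misjudge some package names. It prints the Lmod command rather than running it.

```shell
# Hypothetical helper: print the Lmod equivalent of an SW-cluster "use" argument.
use_to_module() {
    arg=$1
    case "$arg" in
        -l)   echo "module avail"; return ;;
        none) echo "module purge"; return ;;
    esac
    name=${arg%-*}       # text before the last hyphen (whole string if none)
    version=${arg##*-}   # text after the last hyphen (whole string if none)
    # Assumption: a trailing part that starts with a digit is a version number.
    if [ "$name" != "$arg" ]; then
        case "$version" in
            [0-9]*) echo "module load $name/$version"; return ;;
        esac
    fi
    echo "module load $arg"
}
```

For example, use_to_module gcc-5.4.0 prints "module load gcc/5.4.0", while a hyphenated name without a version, such as allpaths-lg, is passed through unchanged.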

Local vs. Compute Canada software

Software on the Frontenac cluster comes from two sources: local installations and Compute Canada's centralized software stack. The Compute Canada stack is standardized: its software is compiled and set up identically on every cluster where it is installed. This helps with reproducibility and with scaling your work across multiple clusters, because the same software behaves the same way regardless of where you run it. A large amount of software is also installed locally; this is how most software requiring licensing or other special local considerations is provided. Both sets of software are used in the same way: just run module load softwareName.

Please note that if you are the first user to use a Compute Canada software package on a node (or it has not been used in some time), the software may initially appear to "hang" and do nothing for several seconds on launch. This is normal - the software is being re-downloaded and cached on the local system. To tell if a piece of software being used is coming from this centralized stack, you can run which <some_command>. If the output begins with /cvmfs, it is part of the Compute Canada software stack.
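
The /cvmfs check described above can be wrapped in a pair of shell functions. This is a minimal sketch: the function names is_cvmfs_path and from_cc_stack are made up for illustration, and it uses the shell builtin "command -v" in place of which to look up the command's path.

```shell
# Succeeds if the given path lies under /cvmfs.
is_cvmfs_path() {
    case "$1" in
        /cvmfs/*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Succeeds if the named command resolves to a path under /cvmfs.
# Returns 2 if the command cannot be found at all.
from_cc_stack() {
    path=$(command -v "$1") || return 2
    is_cvmfs_path "$path"
}
```

A call such as "from_cc_stack python3 && echo 'from the Compute Canada stack'" would then report where the currently selected python3 comes from.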

List of all installed software

For a reasonably up-to-date list of installed software, please check the Compute Canada software page here. The most up-to-date list is always the module system itself. To see all available packages that do not conflict with the current environment, run the module avail command. Packages that are not compatible with the default environment are typically hidden (such as those compiled using the GNU compilers instead of the Intel compiler defaults).


Note that if you just want to load the default version of a piece of software you do not need to include the version when loading a module. For instance, module load gcc will load the default version of GCC (version 5.4.0).
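
In module avail listings, Lmod flags the default version of a package with a D inside the parentheses after its name, for example (D) or (chem,D). The sketch below filters such a listing down to the default versions only. The function name list_defaults is hypothetical, and the parsing assumes the column layout that module avail prints on this cluster; it is a convenience filter, not part of Lmod.

```shell
# Read `module avail`-style text on stdin and print every module
# that is flagged as the default version, one per line.
list_defaults() {
    awk '{
        last = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^\(([a-z]+,)*D\)$/) {
                # A flag group containing D: the preceding name is a default.
                if (last != "") print last
            } else if ($i ~ /\//) {
                # Tokens of the form name/version are module names.
                last = $i
            }
        }
    }'
}
```

Lmod normally writes its listings to standard error, so a typical call would be "module avail 2>&1 | list_defaults".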

Finding a specific package with "module spider"

To check whether a package exists (compatible or not), use module spider packagename. This will typically display a list of package versions, along with instructions on how to load each one (dependencies may change between versions). Here's an example of how a user might find and use the OpenCV library.

First the user looks for packages. module avail shows far too many packages, and module avail opencv shows no match. One useful thing to note here is that module names are almost always lowercase (StdEnv, used later on this page, is an exception).

[user@caclogin02 ~]$ module avail

-------------------------------------------------- /global/software/lmod/modules --------------------------------------------------
   abaqus/2017       (phys,D)    bayescan/g++540                   flexbar/3.0.3                   hwt/hwt_7.2.gnu
   adf/2017_108      (chem)      chapel/1.15.0         (t)         freesurfer/6.0.0                hwt/hwt_7.2.intel (D)
   afni/17.3.05                  cisc875/default                   fsl/5.0.10          (D)         ics/2017u1
   agouti/v0.3.3                 comsol/53             (phys,D)    gaussian/g09e1_sse4 (chem)      matlab/R2017a     (t)
   allpaths-lg/52488 (bio)       dotnet/1.0.4                      gaussian/g16a3_sse4 (chem)      pyrx/094
   anaconda/2.7.13               dotnet/1.1.4                      gaussian/g16b1_sse4 (chem,D)    qiime/1.9.1
   anaconda/3.5.3    (D)         dotnet/2.0.0          (D)         gurobi/752                      redundans/a6621dc
   ansys/ansys181    (phys,D)    faststructure/default             hisat2/2.1.0        (bio,D)

--------------------------------------------------- MPI-dependent avx2 modules ----------------------------------------------------
   abinit/8.2.2               (chem)      lumpy/0.2.13                 (bio)       plumed/2.3.2          (chem,D)
   abinit/8.4.4               (chem,D)    meep/1.3                     (phys)      pnetcdf/1.8.1         (io)
   abyss/1.5.2                (bio)       mpe2/2.4.9b                  (m)         psi4/1.1              (chem)
# more output omitted for brevity

[user@caclogin02 ~]$ module avail opencv
No modules found!
Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

Instead, the user should use module spider opencv to find if the OpenCV module is present.

[user@caclogin02 ~]$ module spider opencv

-------------------------------------------------------------------------------------------------------------------------------
  opencv:
-------------------------------------------------------------------------------------------------------------------------------
    Description:
      OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library.
      OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of
      machine perception in the commercial products.

     Versions:
        opencv/2.4.13.3
        opencv/3.3.0

-------------------------------------------------------------------------------------------------------------------------------
  For detailed information about a specific "opencv" module (including how to load the modules) use the module's full name.
  For example:

     $ module spider opencv/3.3.0
-------------------------------------------------------------------------------------------------------------------------------

This command shows multiple versions. Next, we'll find out how to load the latest one, 3.3.0:

[user@caclogin02 ~]$ module spider opencv/3.3.0

-------------------------------------------------------------------------------------------------------------------------------
  opencv: opencv/3.3.0
-------------------------------------------------------------------------------------------------------------------------------
    Description:
      OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library.
      OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of
      machine perception in the commercial products.

    Properties:
      Visualisation software / Logiciels de visualisation

    You will need to load all module(s) on any one of the lines below before the "opencv/3.3.0" module is available to load.

      nixpkgs/16.09  gcc/5.4.0
      nixpkgs/16.09  gcc/5.4.0  cuda/8.0.44
 
    Help:
      
      Description
      ===========
      OpenCV (Open Source Computer Vision Library) is an open source computer vision
       and machine learning software library. OpenCV was built to provide
       a common infrastructure for computer vision applications and to accelerate
       the use of machine perception in the commercial products.
      
      
      More information
      ================
       - Homepage: http://opencv.org/

In this case, nixpkgs and gcc are the two dependencies for OpenCV. The default intel module is incompatible with gcc, which is why these packages are hidden by default. Let's try loading the dependencies and then the "opencv" module.

[user@caclogin02 ~]$ module load nixpkgs/16.09  gcc/5.4.0

Lmod is automatically replacing "intel/2016.4" with "gcc/5.4.0".


Due to MODULEPATH changes, the following have been reloaded:
  1) openmpi/2.1.1     2) r/3.4.3

[user@caclogin02 ~]$ module load opencv

The OpenCV module now works as advertised. We'll demonstrate this using the OpenCV Python library.

[user@caclogin02 ~]$ module load python

The following have been reloaded with a version change:
  1) python/3.5.2 => python/3.5.4

[user@caclogin02 ~]$ python3
Python 3.5.4 (default, Dec  4 2017, 16:30:40) 
[GCC 5.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.__version__
'3.3.0'
>>> 

To sum things up, the user could access the opencv module with the following in their jobs:

# nixpkgs is always loaded unless you completely unload the software stack, 
# and gcc/5.4.0 is the default version of GCC
module load gcc
module load opencv
module load python
python3 opencv_script.py
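
In a job script it can help to stop as soon as one module fails to load, rather than running the program in a half-configured environment. The following is a minimal sketch of that idea. load_in_order is a hypothetical helper name, and the sketch assumes that module load returns a non-zero exit status on failure.

```shell
# Load the given modules one at a time, in the order listed.
# Stops at the first failure and reports which module was at fault.
load_in_order() {
    for m in "$@"; do
        if ! module load "$m"; then
            echo "failed to load module: $m" >&2
            return 1
        fi
    done
}
```

The job above would then become "load_in_order gcc opencv python && python3 opencv_script.py", so the script only runs if all three modules loaded.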

Completely unloading the software stack

The Compute Canada software stack provides a complete set of binaries, libraries, development headers, and other system utilities that might interfere with things if you want to compile against or use the default, "vanilla" Linux installation. To completely unload all traces of the Compute Canada software stack, you can use the following command:

[user@caclogin02 ~]$ module purge --force
[user@caclogin02 ~]$ module list
No modules loaded

To re-load the software stack:

[user@caclogin02 ~]$ module load StdEnv
[user@caclogin02 ~]$ module list

Currently Loaded Modules:
  1) nixpkgs/16.09   (S)   3) gcccore/.5.4.0    (H)   5) intel/2016.4    (t)      7) openmpi/2.1.1 (m)
  2) icc/.2016.4.258 (H)   4) ifort/.2016.4.258 (H)   6) imkl/11.3.4.258 (math)   8) StdEnv/2016.4 (S)

  Where:
   S:     Module is Sticky, requires --force to unload or purge
   m:     MPI implementations / Implémentations MPI
   math:  Mathematical libraries / Bibliothèques mathématiques
   t:     Tools for development / Outils de développement
   H:     Hidden Module
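
If only a single command needs the "vanilla" Linux environment, the purge can be confined to a subshell so that the modules loaded in the current session are left untouched. This is a sketch under that assumption; run_without_stack is a hypothetical name, not a command provided on the cluster.

```shell
# Run one command with the software stack purged.
# The parentheses start a subshell, so the purge does not
# affect the modules loaded in the calling shell.
run_without_stack() {
    (
        module purge --force
        "$@"
    )
}
```

For example, "run_without_stack gcc --version" would report the system compiler, while a plain "gcc --version" afterwards would still use whichever gcc module is loaded in the session.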