HowTo:Migrate

From CAC Wiki
Revision as of 18:55, 29 November 2016 by Hasch (Talk | contribs) (Migrating from Sparc/Solaris to x86/Linux)

Jump to: navigation, search

Migrating from Sparc/Solaris to x86/Linux

This is a basic guide fro users of our de-commisioned Solaris/Sparc systems who want to continue their work on the current Linux/x86 main cluster.

Access

The login node for the Linux nodes is swlogin1. It may be accessed in two different ways:

  • From the default login node sflogin0 (which stgill runs on Solaris) by secure shell:
ssh -X swlogin1
. Re-issuing the password will be required.
  • Directly from the Secure Global Desktop through the xterm (sxwlogin1) application.

For people used to work on sflogin0, this iomplies an additional "node hop" to swlogin1.

Shell Setup

There are several set-up files in your home directory:

  • .bashrc is "sourced in" every time a bash shell is invoked.
  • .bash_profile applies only to login shells, i.e. when you access the system from outside.

Most of the setup is automatic through usepackage. On login, you have a default setup that is appropriate for a Linux system. Additional packages can be set up by adding commands such as

use anaconda3

to the above setup files, if you want to use the Python 3 distribution "Anaconda" (as an example). Note that this is the same as it was on Solaris, but that the available packages may differ. For a list, use the

use -l

command.

Compiling Code

The standard Fortran/C/C++ compilers differ between the Solaris and the Linux systems. The ones on the x86/Linux platform are discussed here. Here is a comparison in table form. Since there are two compilers (gnu and Intel) on the Linux platform, they are treated separately. The default is gnu. We also list the MPI - related commands for setup, compilation, and runtime.

Fortran/C/C++ Compilers Sparc/Solaris to x86/Linux
Sparc/Solaris x86/Linux (gnu) x86/Linux (Intel)
Name/Version Studio 12.4 Gnu gcc 4.4.7 Intel 12.1
Setup command none (default) none (default) use icsmpi
MPI setup none (default) use openmpi use icsmpi
Fortran / C / C++ compilers f90 / cc / CC gfortran / gcc / g++ ifort / icc / icpc
MPI compoiler wrappers mpif90 / mpicc / mpiCC mpif90 / mpicc / mpicxx mpiifort / mpiicc / mpiicpc
MPI runtime environment mpirun mpirun mpirun

Note that all programs that were running on the Solaris platform have to be re-compiled on Linux. Binaries are not compatible as they are based on different instruction sets.

MPI

On both Solaris and Linux systems, the MPI distribution used is OpenMPI. On the Solaris platform this was integrated with the standard Studio compilers. On the Linux platform, two versions are in use:

  • A stand-alone version of OpenMPI 1.8 is used in combination with the gcc compiler and setup through the use openmpi command.
  • A second version (Intel 4.0 update 3) is used with the Intel compilers and set up together with them ("use icsmpi")

All of these versions use the mpirun command to invoke the runtime environment. Check with which mpirun to see which version you are currently using.

Binary Formats

Important: Some programs use binary format for data I/O. These files are likely not compatible between the two platforms, which means that it may be necessary to re-run the programs on the new platform or convert the data files before using them. This is due the different "Endianness" on the two platforms: Sparc/Solaris is big-Endian and x86/Linux is little-Endian. If you encounter issues with data files, please get in touch with us.

Scheduling

Both the "old" M9000 servers and the "new" SW (Linux) cluster use Sun Grid Engine as a scheduler. Please consult our Scheduler Help File for details about its usage. The following table gives an overview of the alterations that need to be made to a submission script if execution is to take place on the Linux production nodes, i.e. the "SW cluster".

Changes in SGE submissions when migrating from Sparc/Solaris to x86/Linux
Sparc/Solaris x86/Linux
Queue name m9k.q (old default, deprecated) abaqus.q
Node names m9k000* sw00**, cac0**
Login node for
submission
sflogin0 swlogin1
Rel. Serial Execution Speed 1 3-6
Suggested Relative Nprocs 1 1/2
Queue specification
in submit script
none
#$ -q abaqus.q
#$ -l qname=abaqus.q
Gaussian Parallel environment
#$ -pe gaussian.pe
#$ -pe glinux.pe
Gaussian Setup line
. /opt/gaussian/setup.sh
. /opt/gaussian/setup.sh

Since for historical reasons the M9000 servers were associated with the default queue of the scheduler, additional lines have to be included to add the abaqus.q queue (which is associated with the Linmux cluster and force scheduling through that queue. Also note that it is strongly suggested to lower the number of processes requested when submitting to the SW cluster. This is because the nodes are substantially smaller than then the M9000 servers, but provide greatly improved per-core performance. This means that even with half the core count, a speedup of 2-3 is likely.

We have added some entries to the table describing modifications that apply only for submissions of jobs running the Computational Chemistry software Gaussian. For more details about this software, please consult our Gaussian Help File. Gaussian submissions go to a dedicated large node on the SW cluster that uses local scratch space to improve performance and avoid bandwidth issues with IO.

Help

If you have questions that you can't resolve by checking documentation, email to cac.help@queensu.ca.