Difference between revisions of "HowTo:Migrate"

From CAC Wiki
(Compiling Code)
 
(18 intermediate revisions by one other user not shown)
Line 1: Line 1:
 
= Migrating from Sparc/Solaris to x86/Linux =
 
= Migrating from Sparc/Solaris to x86/Linux =
  
This is an introduction to setting up your account on our systems. When first logging in, you are presented with a default set-up that enables the use of basic system commands, simple compilers, and access to the scheduler. This help file is meant to explain how to modify that default.
+
This is a basic guide for former users of our decommissioned Solaris/Sparc systems who want to continue their work on the current Linux/x86 main cluster.
  
 
{|  style="border-spacing: 8px;"
 
{|  style="border-spacing: 8px;"
Line 8: Line 8:
 
== Access ==
 
== Access ==
  
The login node for the Linux nodes is '''swlogin1'''. It may be accessed in two different ways:
+
The login node for the Linux nodes is '''swlogin1'''. It may be accessed from the default login node '''sflogin0''' (which still runs on Solaris) by secure shell:
 +
<pre>ssh -X swlogin1</pre> You will be asked to re-enter your password.
  
* From the default login node '''sflogin0''' (which still runs on Solaris) by secure shell:<pre>
 
ssh -X swlogin1</pre>. Re-issuing the password will be required.
 
* Directly from the Secure Global Desktop through the '''xterm (sxwlogin1)''' application.
 
  
 
For people used to working on sflogin0, this implies an additional "node hop" to swlogin1.
 
For people used to working on sflogin0, this implies an additional "node hop" to swlogin1.
Line 38: Line 36:
 
== Compiling Code ==
 
== Compiling Code ==
  
The standard Fortran/C/C++ compilers differ between the Solaris and the Linux systems. [[HowTo:Compilers|The ones on the x86/Linux platform are discussed here]]. Here is a comparison in table form.
+
The standard Fortran/C/C++ compilers differ between the Solaris and the Linux systems. [[HowTo:Compilers|The ones on the x86/Linux platform are discussed here]]. Here is a comparison in table form. Since there are two compilers ('''gnu''' and '''Intel''') on the Linux platform, they are treated separately. The default is '''gnu'''. We also list the MPI-related commands for setup, compilation, and runtime.
  
 
{| class="wikitable" style="float:left; margin-right: 25px;"
 
{| class="wikitable" style="float:left; margin-right: 25px;"
Line 45: Line 43:
 
|  
 
|  
 
|'''Sparc/Solaris'''  
 
|'''Sparc/Solaris'''  
|'''x86/Linux'''
+
|'''x86/Linux (gnu)'''
|
+
|'''x86/Linux (Intel)'''
 
|-
 
|-
 
| '''Name/Version'''
 
| '''Name/Version'''
 
| Studio 12.4
 
| Studio 12.4
| Gnu gcc 4.4.3 / Intel 12.1
+
| Gnu gcc 4.4.7
|
+
| Intel 12.1
 +
|-
 +
| '''Setup command'''
 +
| none (default)
 +
| none (default)
 +
| use icsmpi
 +
|-
 +
| '''MPI setup'''
 +
| none (default)
 +
| use openmpi
 +
| use icsmpi
 +
|-
 +
| '''Fortran / C / C++ compilers'''
 +
| f90 / cc / CC
 +
| gfortran / gcc / g++
 +
| ifort / icc / icpc
 +
|-
 +
| '''MPI compiler wrappers'''
 +
| mpif90 / mpicc / mpiCC
 +
| mpif90 / mpicc / mpicxx
 +
| mpiifort / mpiicc / mpiicpc 
 +
|-
 +
|'''MPI runtime environment'''
 +
| mpirun
 +
| mpirun
 +
| mpirun
 
|}
 
|}
  
== Manual Set-Up ==
+
Note that '''all''' programs that were running on the Solaris platform have to be re-compiled on Linux. Binaries are not compatible as they are based on different instruction sets.
  
You can of course apply settings directly without using "use".
+
== MPI ==
  
One of the most important environment variables is '''PATH''', which tells the system where to look for the commands you issue. You may want to make your shell aware of some directories with system commands and shell commands in them.
+
On both Solaris and Linux systems, the MPI distribution used is OpenMPI. On the Solaris platform this was integrated with the standard Studio compilers. On the Linux platform, two versions are in use:
 
+
* A stand-alone version of OpenMPI 1.8 is used in combination with the gcc compiler and set up through the '''use openmpi''' command.
Another environment variable that is often useful is '''MANPATH'''. This is for the Unix manual pages, and tells the system where to look for online-documentation.
+
* A second version (Intel 4.0 update 3) is used with the Intel compilers and set up together with them ("use icsmpi").
 
+
All of these versions use the '''mpirun command''' to invoke the runtime environment. Check with '''which mpirun''' to see which version you are currently using.
Yet another one is '''LD_LIBRARY_PATH''', which is sometimes used by applications to find dynamic runtime libraries. If you experience problems with missing libraries, try playing with LD_LIBRARY_PATH, otherwise it's best left unset.
+
 
+
The command to set an EV is the binary operator '='. This is often followed by the '''export''' command, which makes the variable part of the environment:
+
<pre>
+
VARIABLE=VALUE
+
export VARIABLE
+
</pre>
+
 
+
Note that it is possible to place "export" in front of the variable assignment instead of issuing two separate commands:<pre>export VARIABLE=VALUE</pre>
+
 
+
To access the value of a environment variable, place a "$" in front of it. For example you want to see which value your variable PATH has, type <pre>echo $PATH </pre> where "echo" is a standard Unix command, and "$PATH" returns the value of PATH. The following command will append something to a previously defined variable:
+
 
+
<pre>export PATH=$PATH":/yet/another/directory</pre>
+
 
+
Here, "PATH" is the variable and "$PATH" is its present value.
+
 
+
Sometimes a variable needs to be reset for a specific application. It is then best to write a shell script that sets the variables and starts the application, rather than setting the variables globally in your start-up files. In this example:
+
 
+
<pre>OMP_NUM_THREADS=8 omp_program</pre>
+
 
+
The variable OMP_NUM_THREADS (which determines how many threads are being used) is set '''only''' for this specific run of the multi-threaded program omp_program.
+
 
+
You can consult the configuration files of "usepackage" /opt/usepackage/etc/usepackage.conf to find out which setting you need to apply to run a specific software or access certain features. The syntax in that file is not hard to read, for instance the entry
+
 
+
<pre>
+
>> g09 : "Gaussian 09 Update E.01" <<
+
g09 : g09root = "/opt/gaussian/g09e1",
+
      <[ . /opt/gaussian/g09e1/g09/bsd/g09.profile ]>,
+
      GAUSS_SCRDIR = "/scratch/$LOGNAME";
+
</pre>
+
 
+
is responsible for the settings in the "Gaussian" example in the previous section.
+
 
|}
 
|}
 
{|  style="border-spacing: 8px;"
 
{|  style="border-spacing: 8px;"
 
| valign="top" width="50%" style="padding:1em; border:1px solid #aaaaaa; background-color:#e1eaf1; border-radius:7px" |
 
| valign="top" width="50%" style="padding:1em; border:1px solid #aaaaaa; background-color:#e1eaf1; border-radius:7px" |
== Running pre-installed software ==
 
  
A lot of software is pre-installed on our clusters. Some of this software requires specific license agreements, other programs are freely accessible. With the use command, most of them can be set up with a single line such as "use fluent" in your shell's start-up file. If the software you want to run is not included in our usepackage list, please contact us, and we can include it. If you are using very specific software that is not accessed by other users, you might have to do the setup manually.
+
== Binary Formats ==
  
Here is a few steps to follow in that case.
+
'''Important:''' Some programs use binary format for data I/O. These files are likely not compatible between the two platforms, which means that it may be necessary to re-run the programs on the new platform or convert the data files before using them. This is due to the different [https://en.wikipedia.org/wiki/Endianness "Endianness"] of the two platforms: Sparc/Solaris is '''big-Endian''' and x86/Linux is '''little-Endian'''. If you encounter issues with data files, please [mailto:cac.help@queensu.ca get in touch with us].
  
* Check out the '''documentation''' for the specific program, including users' manuals and home pages.
+
== Scheduling ==
* Inform yourself about '''licensing'''. Some software requires each individual user to hold a license, some is covered by a collective license agreement, some does not require a license at all. For example, the finite-element structural code "Abaqus" is only accessible to users who work at an institution that is covered by a local license, whereas the license agreement for the electronic-structure code "Gaussian" covers all our HPCVL users. Finally, code such as "Gamess" (another quantum-chemistry program) are free to use by all users, although the distributor encourages registration.
+
* Set the proper '''environment variables'''. This can usually be done in your shell setup files, since you'll be running the same code on most occasions you log on. These variables might include the PATH, but also variables specific for the program in question. Which ones to set you will be able to find out in most cases from the program documentation. Remember that this is only necessary if no entry exists in the "usepackage" configuration file, which can be checked by running "use -l".
+
  
==How do I run parallel code ?==
+
Both the "old" M9000 servers and the "new" SW (Linux) cluster use Sun Grid Engine as a scheduler. Please consult [[HowTo:Scheduler|our Scheduler Help File]] for details about its usage. The following table gives an overview of the alterations that need to be made to a submission script if execution is to take place on the Linux production nodes, i.e. the "SW cluster".
 +
 
 +
{| class="wikitable" style="float:left; margin-right: 25px;"
 +
!colspan="3"| '''Changes in SGE submissions when migrating from Sparc/Solaris to x86/Linux'''
 +
|-
 +
|
 +
|'''Sparc/Solaris'''
 +
|'''x86/Linux'''
 +
|-
 +
| '''Queue name'''
 +
| m9k.q (old default, deprecated)
 +
| abaqus.q (new default)
 +
|-
 +
| '''Node names'''
 +
| m9k000*
 +
| sw00**, cac0**
 +
|-
 +
| '''Login node for <br> submission'''
 +
| sflogin0
 +
| swlogin1
 +
|-
 +
| '''Rel. Serial Execution Speed'''
 +
| 1
 +
| 3-6
 +
|-
 +
| '''Suggested Relative Nprocs'''
 +
| 1
 +
| 1/2
 +
|-
 +
| '''Queue specification <br> in submit script'''
 +
| none
 +
| none
 +
|-
 +
| '''Gaussian Parallel environment'''
 +
| <pre>#$ -pe gaussian.pe</pre>
 +
| <pre>#$ -pe glinux.pe</pre>
 +
|-
 +
| '''Gaussian Setup line'''
 +
| <pre>. /opt/gaussian/setup.sh</pre>
 +
| <pre>. /opt/gaussian/setup.sh</pre>
 +
|}
  
That depends on how the code is "parallelized":
+
Note that it is strongly suggested to '''lower the number of processes''' requested when submitting to the SW cluster. This is because the nodes are substantially smaller than the M9000 servers, but provide greatly improved per-core performance. This means that even with half the core count, a speedup of 2-3 is likely.
* If it was "multi-threaded" by the compiler (automatic or via compiler directives), it is usually enough to set the environment variable PARALLEL or OMP_NUM_THREADS to the number of threads that should be used.
+
* If it is MPI code, a special parallel runtime environment has to be used. The command there is mpirun, which has command-line options that let you tell how many and which processors to use. This command is part of the Cluster Tools parallel runtime environment. Cluster Tools involves a good deal of commands that let you modify the condition under which your program runs. The settings for these are included in our default setup.
+
  
You can learn more about parallel code by having a look at our [[FAQ:Parallel|Parallel Programming FAQ]]. We also have a bit more specific information about  parallel programming tools, namely [[FAQ:OpenMP|OpenMP compiler directives]] and the [[FAQ:MPI|Message Passing Interface (MPI)]].
+
We have added some entries to the table describing modifications that apply only for submissions of jobs running the Computational Chemistry software '''Gaussian'''. For more details about this software, please consult our [[HowTo:gaussian|Gaussian Help File]]. Gaussian submissions go to a dedicated large node on the SW cluster that uses local scratch space to improve performance and avoid bandwidth issues with IO.
  
 
== Help ==
 
== Help ==
 
If you have questions that you can't resolve by checking documentation, [mailto:cac.help@queensu.ca email to cac.help@queensu.ca].
 
If you have questions that you can't resolve by checking documentation, [mailto:cac.help@queensu.ca email to cac.help@queensu.ca].
 
|}
 
|}

Latest revision as of 18:16, 29 August 2017

Migrating from Sparc/Solaris to x86/Linux

This is a basic guide for former users of our decommissioned Solaris/Sparc systems who want to continue their work on the current Linux/x86 main cluster.

Access

The login node for the Linux nodes is swlogin1. It may be accessed from the default login node sflogin0 (which still runs on Solaris) by secure shell:

ssh -X swlogin1

You will be asked to re-enter your password.


For people used to working on sflogin0, this implies an additional "node hop" to swlogin1.

Shell Setup

There are several set-up files in your home directory:

  • .bashrc is "sourced in" every time a bash shell is invoked.
  • .bash_profile applies only to login shells, i.e. when you access the system from outside.

Most of the setup is automatic through usepackage. On login, you have a default setup that is appropriate for a Linux system. Additional packages can be set up by adding commands such as

use anaconda3

to the above setup files, if you want to use the Python 3 distribution "Anaconda" (as an example). Note that this is the same as it was on Solaris, but that the available packages may differ. For a list, use the

use -l

command.
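
As a minimal sketch, the lines below could go into .bashrc. The helper name setup_package and the guard around use are assumptions added for illustration: the guard lets the file load cleanly on a machine where usepackage is not installed. The package anaconda3 is just the example from above.

```shell
# Sketch of a .bashrc fragment. setup_package is a hypothetical helper;
# it skips the setup quietly on hosts where "use" is not defined.
setup_package() {
    if type use >/dev/null 2>&1; then
        use "$1"
    else
        echo "usepackage not available, skipping: $1" >&2
        return 0
    fi
}

setup_package anaconda3
```

On the cluster itself the plain line "use anaconda3" is sufficient; the wrapper only matters if the same start-up file is shared with other machines.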

Compiling Code

The standard Fortran/C/C++ compilers differ between the Solaris and the Linux systems. The ones on the x86/Linux platform are discussed here. Here is a comparison in table form. Since there are two compilers (gnu and Intel) on the Linux platform, they are treated separately. The default is gnu. We also list the MPI-related commands for setup, compilation, and runtime.

Fortran/C/C++ Compilers Sparc/Solaris to x86/Linux

                            | Sparc/Solaris          | x86/Linux (gnu)         | x86/Linux (Intel)
Name/Version                | Studio 12.4            | Gnu gcc 4.4.7           | Intel 12.1
Setup command               | none (default)         | none (default)          | use icsmpi
MPI setup                   | none (default)         | use openmpi             | use icsmpi
Fortran / C / C++ compilers | f90 / cc / CC          | gfortran / gcc / g++    | ifort / icc / icpc
MPI compiler wrappers       | mpif90 / mpicc / mpiCC | mpif90 / mpicc / mpicxx | mpiifort / mpiicc / mpiicpc
MPI runtime environment     | mpirun                 | mpirun                  | mpirun

Note that all programs that were running on the Solaris platform have to be re-compiled on Linux. Binaries are not compatible as they are based on different instruction sets.
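
When old build scripts or makefiles still refer to the Studio compiler names, the compiler comparison above can be turned into a small lookup. The following is only a sketch: the function name linux_compiler and the keywords "gnu" and "intel" are made up for this example and are not part of the system setup.

```shell
# Map a Solaris Studio compiler name to its Linux counterpart, following the
# compiler comparison in this section. linux_compiler is a hypothetical helper.
linux_compiler() {
    old="$1"
    family="${2:-gnu}"    # gnu is the default on the Linux platform
    case "$family:$old" in
        gnu:f90)    echo gfortran ;;
        gnu:cc)     echo gcc ;;
        gnu:CC)     echo g++ ;;
        intel:f90)  echo ifort ;;
        intel:cc)   echo icc ;;
        intel:CC)   echo icpc ;;
        *) echo "no mapping for '$old' ($family)" >&2; return 1 ;;
    esac
}

# Example: pick the Fortran compiler for a rebuild with the default (gnu) set.
FC=$(linux_compiler f90 gnu)
echo "Fortran compiler: $FC"
```

Compiler options are not covered by this mapping. Studio-specific flags generally have no direct gcc or Intel equivalent and need to be reviewed by hand.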

MPI

On both Solaris and Linux systems, the MPI distribution used is OpenMPI. On the Solaris platform this was integrated with the standard Studio compilers. On the Linux platform, two versions are in use:

  • A stand-alone version of OpenMPI 1.8 is used in combination with the gcc compiler and set up through the use openmpi command.
  • A second version (Intel 4.0 update 3) is used with the Intel compilers and set up together with them ("use icsmpi").

All of these versions use the mpirun command to invoke the runtime environment. Check with which mpirun to see which version you are currently using.
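
The check can be wrapped into a few lines of shell so that a job script fails early with a readable message. This is a sketch only; the function name report_mpirun and the wording of the messages are illustrative.

```shell
# Report which mpirun is first on the PATH, or say that none is set up.
# report_mpirun is a hypothetical helper name.
report_mpirun() {
    if path=$(command -v mpirun 2>/dev/null); then
        echo "mpirun found: $path"
    else
        echo "mpirun not found; run 'use openmpi' or 'use icsmpi' first"
    fi
}

report_mpirun
```

After the setup, a typical sequence with the gnu tool chain would be "mpicc -o hello hello.c" followed by "mpirun -np 4 ./hello", where hello.c stands for your own MPI source file.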

Binary Formats

Important: Some programs use binary format for data I/O. These files are likely not compatible between the two platforms, which means that it may be necessary to re-run the programs on the new platform or convert the data files before using them. This is due to the different "Endianness" of the two platforms: Sparc/Solaris is big-Endian and x86/Linux is little-Endian. If you encounter issues with data files, please get in touch with us.
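
To see the difference directly, the byte order of the machine you are logged in to can be printed with standard tools. The sketch below uses only printf, od and tr; the function name host_byte_order is made up for this example.

```shell
# Print the byte order of the current machine. The two bytes 0x01 0x00 are
# read back as one 16-bit integer: a little-endian host sees 0x0001,
# a big-endian host sees 0x0100.
host_byte_order() {
    word=$(printf '\001\000' | od -An -tx2 | tr -d ' \n')
    case "$word" in
        0001) echo little-endian ;;
        0100) echo big-endian ;;
        *)    echo unknown ;;
    esac
}

host_byte_order
```

For Fortran unformatted files, re-running may not be necessary: gfortran accepts the option -fconvert=big-endian and ifort accepts -convert big_endian, which let a program compiled on Linux read and write big-endian data. Whether this is enough depends on how the files were written, so treat it as a starting point.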

Scheduling

Both the "old" M9000 servers and the "new" SW (Linux) cluster use Sun Grid Engine as a scheduler. Please consult our Scheduler Help File for details about its usage. The following table gives an overview of the alterations that need to be made to a submission script if execution is to take place on the Linux production nodes, i.e. the "SW cluster".

Changes in SGE submissions when migrating from Sparc/Solaris to x86/Linux

                                     | Sparc/Solaris                   | x86/Linux
Queue name                           | m9k.q (old default, deprecated) | abaqus.q (new default)
Node names                           | m9k000*                         | sw00**, cac0**
Login node for submission            | sflogin0                        | swlogin1
Rel. Serial Execution Speed          | 1                               | 3-6
Suggested Relative Nprocs            | 1                               | 1/2
Queue specification in submit script | none                            | none
Gaussian Parallel environment        | #$ -pe gaussian.pe              | #$ -pe glinux.pe
Gaussian Setup line                  | . /opt/gaussian/setup.sh        | . /opt/gaussian/setup.sh

Note that it is strongly suggested to lower the number of processes requested when submitting to the SW cluster. This is because the nodes are substantially smaller than the M9000 servers, but provide greatly improved per-core performance. This means that even with half the core count, a speedup of 2-3 is likely.

We have added some entries to the table describing modifications that apply only for submissions of jobs running the Computational Chemistry software Gaussian. For more details about this software, please consult our Gaussian Help File. Gaussian submissions go to a dedicated large node on the SW cluster that uses local scratch space to improve performance and avoid bandwidth issues with IO.
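
The changes for an existing Gaussian submission script can be sketched as a small filter. It rests on assumptions that may not hold for your scripts: the parallel environment line is written as "#$ -pe gaussian.pe N" with the slot count N as the fourth field, any explicit "#$ -q m9k.q" line can simply be dropped because no queue specification is needed, and N is halved (rounded up) following the suggestion above. The function name migrate_sge_script is hypothetical.

```shell
# Rewrite an SGE submission script from the Solaris settings to the Linux ones.
# Reads the old script on stdin and writes the new one to stdout.
migrate_sge_script() {
    awk '
        # Parallel environment: switch to glinux.pe and halve the slot count.
        /^#[$][ \t]+-pe[ \t]+gaussian\.pe/ {
            slots = $4
            if (slots ~ /^[0-9]+$/) {
                half = int((slots + 1) / 2)
                print "#$ -pe glinux.pe " half
            } else {
                print "#$ -pe glinux.pe"
            }
            next
        }
        # The old default queue is deprecated; drop an explicit request for it.
        /^#[$][ \t]+-q[ \t]+m9k\.q/ { next }
        # Everything else, including the Gaussian setup line, stays unchanged.
        { print }
    '
}

# Example: a request for 8 slots on the old system.
printf '%s\n' '#$ -pe gaussian.pe 8' | migrate_sge_script
# prints: #$ -pe glinux.pe 4
```

A whole script could be converted with "migrate_sge_script < old_job.sh > new_job.sh", where both file names are placeholders. Check the result by hand before submitting it.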

Help

If you have questions that you can't resolve by checking documentation, please email cac.help@queensu.ca.