Hardware:SW

From CAC Wiki
Revision as of 18:26, 9 March 2017 by Hasch (Talk | contribs) (The SW (Linux) Cluster)


The SW cluster is presently our main compute cluster. Note that we have undergone a major hardware upgrade and that large portions of these pages are subject to change. Please re-visit occasionally to keep abreast of this.

The SW (Linux) Cluster

The Centre for Advanced Computing operates a cluster of x86-based multicore machines running Linux. This page explains the essential features of this cluster and is meant as a basic guide to its usage.

SW (Linux) Cluster Nodes ("old" sw series)
Host CPU model Speed Cores Threads Memory
sw0044 Xeon E7-4860 2.3GHz 40 80 256 GB
sw0045 Xeon E7-4860 2.3GHz 40 80 256 GB
sw0046 Xeon E7-4860 2.3GHz 40 80 256 GB
sw0047 Xeon E7-4860 2.3GHz 40 80 256 GB
sw0048 Xeon E7-4860 2.3GHz 40 80 256 GB
sw0049 Xeon E7-4860 2.3GHz 40 80 256 GB
Software (SW) Linux Cluster
SW (Linux) Cluster Nodes ("new" cac series)
Host CPU model Speed Cores Threads Memory
cac019 E7-4860 2.3 GHz 40 80 256 GB
cac020 E7-4830 v3 2.1 GHz 48 96 1.2 TB
cac021 E7-4830 v3 2.1 GHz 48 96 1.2 TB
cac022 E7-8860 2.3 GHz 80 160 512 GB
cac023 E7-8860 2.4 GHz 80 160 512 GB
cac024 E7-8860 2.4 GHz 80 160 512 GB
cac025 E7-4860 2.3 GHz 40 80 1 TB
cac026 E7-4860 2.3 GHz 40 80 1 TB
cac027 E7-8850 v2 2.3 GHz 48 96 256 GB
cac028 E7-8867 v3 2.5 GHz 128 256 2 TB
cac029 E7-8867 v3 2.5 GHz 128 256 2 TB
cac030 E7-8867 v3 2.5 GHz 128 256 2 TB
cac032 E7-8867 v3 2.5 GHz 128 256 2 TB
cac033 E7-8867 v3 2.5 GHz 128 256 2 TB
cac034 E5-2650 v4 2.2 GHz 24 256 GB
cac035 E5-2650 v4 2.2 GHz 24 256 GB
cac036 E5-2650 v4 2.2 GHz 24 256 GB
cac037 E5-2650 v4 2.2 GHz 24 256 GB
cac038 E5-2650 v4 2.2 GHz 24 256 GB
cac039 E5-2650 v4 2.2 GHz 24 256 GB
cac040 E5-2650 v4 2.2 GHz 24 256 GB
cac041 E5-2650 v4 2.2 GHz 24 256 GB
cac042 E5-2650 v4 2.2 GHz 24 256 GB
cac043 E5-2650 v4 2.2 GHz 24 256 GB
cac044 E5-2650 v4 2.2 GHz 24 256 GB
cac045 E5-2650 v4 2.2 GHz 24 256 GB
cac046 E5-2650 v4 2.2 GHz 24 256 GB
cac047 E5-2650 v4 2.2 GHz 24 256 GB
cac048 E5-2650 v4 2.2 GHz 24 256 GB
cac049 E5-2650 v4 2.2 GHz 24 256 GB
cac050 E5-2650 v4 2.2 GHz 24 256 GB
cac051 E5-2650 v4 2.2 GHz 24 256 GB
cac052 E5-2650 v4 2.2 GHz 24 256 GB
cac053 E5-2650 v4 2.2 GHz 24 256 GB
cac054 E5-2650 v4 2.2 GHz 24 256 GB
cac055 E5-2650 v4 2.2 GHz 24 256 GB
cac056 E5-2650 v4 2.2 GHz 24 256 GB
cac057 E5-2650 v4 2.2 GHz 24 256 GB
cac058 E5-2650 v4 2.2 GHz 24 256 GB
cac059 E5-2650 v4 2.2 GHz 24 256 GB
cac060 E5-2650 v4 2.2 GHz 24 256 GB
cac061 E5-2650 v4 2.2 GHz 24 256 GB
cac062 E5-2650 v4 2.2 GHz 24 256 GB
cac063 E5-2650 v4 2.2 GHz 24 256 GB
cac064 E5-2650 v4 2.2 GHz 24 256 GB
cac065 E5-2650 v4 2.2 GHz 24 256 GB
cac066 E5-2650 v4 2.2 GHz 24 256 GB
cac067 E5-2650 v4 2.2 GHz 24 256 GB
cac068 E5-2650 v4 2.2 GHz 24 256 GB
cac069 E5-2650 v4 2.2 GHz 24 256 GB
cac070 E5-2650 v4 2.2 GHz 24 256 GB
cac071 E5-2650 v4 2.2 GHz 24 256 GB
cac072 E5-2650 v4 2.2 GHz 24 256 GB
cac073 E5-2650 v4 2.2 GHz 24 256 GB
cac074 E5-2650 v4 2.2 GHz 24 256 GB
cac075 E5-2650 v4 2.2 GHz 24 256 GB
cac076 E5-2650 v4 2.2 GHz 24 256 GB
cac077 E5-2650 v4 2.2 GHz 24 256 GB
cac078 E5-2650 v4 2.2 GHz 24 256 GB
cac079 E5-2650 v4 2.2 GHz 24 256 GB
cac080 E5-2650 v4 2.2 GHz 24 256 GB
cac081 E5-2650 v4 2.2 GHz 24 256 GB
cac082 E5-2650 v4 2.2 GHz 24 256 GB
cac083 E5-2650 v4 2.2 GHz 24 256 GB
cac084 E5-2650 v4 2.2 GHz 24 256 GB
cac085 E5-2650 v4 2.2 GHz 24 256 GB
cac086 E5-2650 v4 2.2 GHz 24 256 GB
cac087 E5-2650 v4 2.2 GHz 24 256 GB
cac088 E5-2650 v4 2.2 GHz 24 256 GB
cac089 E5-2650 v4 2.2 GHz 24 256 GB
cac090 E5-2650 v4 2.2 GHz 24 256 GB
cac091 E5-2650 v4 2.2 GHz 24 256 GB
cac092 E5-2650 v4 2.2 GHz 24 256 GB
cac093 E5-2650 v4 2.2 GHz 24 256 GB
cac094 E5-2650 v4 2.2 GHz 24 256 GB
cac095 E5-2650 v4 2.2 GHz 24 256 GB
cac096 E5-2650 v4 2.2 GHz 24 256 GB
cac097 E5-2650 v4 2.2 GHz 24 256 GB
cac098 E5-2650 v4 2.2 GHz 24 256 GB
cac099 E5-2650 v4 2.2 GHz 24 256 GB

Type of Hardware

This cluster consists of X86 multicore nodes made by Lenovo and IBM. All nodes run CentOS Linux and share a file system. Access is handled by Grid Engine. The server nodes are called cac019...cac099.

  • Presently, the workup node of the "Software Cluster" is swlogin1. This is a Dell PowerEdge R410 server with 2 sockets, each holding a 6-core Intel® Xeon® processor (Intel X5675) running at 3.1 GHz.
  • Larger high-memory nodes were added in August 2016. These are of the Lenovo System x3950 X6 8-socket type with 8 x Intel E7-8867 v3 16-core processors at 2.5 GHz, for a total of 128 cores (256 threads with two-way hyperthreading). Each of these units has a total of 2 TB of memory. They are used for special applications that require high memory.
  • Some older nodes are IBM XServers 3850-X5 that are also based on the Intel® Xeon® processor (Intel E7-4860). These servers have a total of 40 cores per node and support up to 80 threads (hyperthreading). The clock speed for these machines is 2.27 GHz. Two of these servers (sw0050-51) have 1 TB of physical memory; the others have 256 GB.
  • A few of our nodes are IBM servers based on Intel E7-8860 processors with 80 cores total (160 threads) running at 2.27 GHz, while another one (sw0053) with 80 cores (160 threads) uses the E7-8870 at 2.4 GHz. Each of these has 512 GB of memory.

Why these Systems?

The main emphasis in these systems is high floating-point performance for a modest number of processes / threads. Since commercial software such as Fluent and Abaqus offers support for Linux only, this cluster was originally acquired to offer recent versions of these software packages. In addition, the higher single-core performance of these nodes allows for an efficient use of license seats, which are usually priced per core.

Who Should Use This Cluster?

The software cluster runs the Linux operating system and should be used by anyone who wants to run applications that are available on that platform. Runs that require more than 32 GB of memory must request it explicitly to avoid mis-scheduling.

We suggest you use this cluster if:

  • Your application is floating-point intensive with modest amounts of memory.
  • Your application is commercial or public-domain software that supports Linux.
  • Your application is explicitly parallel (for instance, using MPI) and has low communication requirements, or is multi-threaded with a small number (typically no more than 12) of scaling threads.
  • Your application uses a commercial license that is scaled per process.

This cluster may not be suitable if:

  • Your application is very memory intensive. Such jobs may face long waiting times.
  • Your application is required to scale to a very large number of processes in a distributed-memory fashion and is communication intensive. Such jobs require a fast interconnect (Infiniband or similar) and should be run on a different system, for instance other Compute Canada installations.

If you think your application could run more efficiently on these machines, please contact us (help@hpcvl.org) to discuss any concerns and let us assist you in getting started.

Note that we have to enforce dedicated cores or CPUs to avoid sharing and context switching overheads. No "overloading" can be allowed.

Using the Cluster

Access

ssh hpcXXXX@130.15.59.64
hpcXXXX@130.15.59.64's password: *****
hpcXXXX@sflogin0$ ssh swlogin1
hpcXXXX@swlogin1's password: ***** 

The file systems for all of our clusters are shared, so you will be using the same home directory as when you are using the M9000 servers or the standard login node sfnode0. swlogin1 can be used for compilation, program development, and testing only, not for production jobs.
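The two-step login shown above can often be collapsed into a single command with the OpenSSH -J (jump host) option. The sketch below only prints that command for a given user name, so it can be inspected before it is run. The helper name sw_login_cmd is hypothetical, and the approach assumes an OpenSSH client of version 7.3 or later and a login node that permits forwarding, neither of which is confirmed on this page.

```shell
# Hypothetical helper: print a one-step login command for the given user.
# -J routes the connection through the first host (the login node used above).
sw_login_cmd() {
    user="$1"
    printf 'ssh -J %s@130.15.59.64 %s@swlogin1\n' "$user" "$user"
}

# Replace hpcXXXX with your own user name.
sw_login_cmd hpcXXXX
```

If the printed command does not work from your machine, the two-step login remains the documented route.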

Compiling Code

Intel Compiler Suite

The best compiler to use is the Intel Compiler Suite. This includes compilers for Fortran, C, and C++, as well as MPI and OpenMP support, debuggers, and a development suite. This software resides in /opt/ics. The versions are:

  • Fortran (ifort): Intel(R) Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 12.1 Build 20110811
  • C (icc): Intel(R) C Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 12.1 Build 20110811
  • C++ (icpc): Intel(R) C++ Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 12.1 Build 20110811

This compiler suite needs to be activated before use. The command is

use icsmpi
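One way to confirm that the activation took effect is to check whether the compiler commands are on the search path. The following is a minimal sketch; the function name check_compilers is hypothetical, and it assumes that activation works by extending PATH.

```shell
# Hypothetical helper: report where each named command is found on PATH.
check_compilers() {
    for c in "$@"; do
        if command -v "$c" >/dev/null 2>&1; then
            printf '%s: %s\n' "$c" "$(command -v "$c")"
        else
            printf '%s: not found\n' "$c"
        fi
    done
}

# The three Intel compiler commands listed above.
check_compilers ifort icc icpc
```

A "not found" line after activation suggests that the activation command did not run in the current shell.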

GNU Compilers

In many cases, especially for public-domain software, the preferable compiler is GNU C/C++/Fortran. The system version of these is:

Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info 
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix 
--enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions 
--enable-gnu-unique-object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk 
--disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile 
--enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib 
--with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC)

No special activation is needed to use these, as they reside in a system directory. A newer version of this compiler set is available in /opt/gcc-4.8.3 and can be accessed using the command

use gcc-4.8.3

If MPI is required, it can be loaded through

use openmpi
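As a quick check that a compiler works, a small test program can be built and run on the login node. The sketch below is only an illustration: the program, the temporary directory, and the CC variable are assumptions made for this example. CC defaults to the system gcc and could be set to icc after the Intel suite has been activated.

```shell
# Build and run a tiny C program with whichever compiler CC names.
CC="${CC:-gcc}"
workdir="$(mktemp -d)"

cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>

int main(void)
{
    printf("hello from the SW cluster\n");
    return 0;
}
EOF

# Compile only if the compiler is actually available on PATH.
if command -v "$CC" >/dev/null 2>&1; then
    "$CC" -O2 -o "$workdir/hello" "$workdir/hello.c" && "$workdir/hello"
else
    echo "$CC not found on PATH" >&2
fi
```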

For applications that cannot be re-compiled (for instance, because the source code is not accessible), a pre-compiled Linux version needs to be obtained (an x64 build for Red Hat will do the trick).

Running Jobs

As mentioned earlier, program runs for user and application software on the login node are allowed only for test purposes or if interactive use is unavoidable. In the latter case, please get in touch to let us know what you need. Production jobs must be submitted through the Grid Engine load scheduler.

You need to add the following two lines to your script for your job to be scheduled to the Linux SW cluster exclusively:

#$ -q abaqus.q 
#$ -l qname=abaqus.q

The queue is named after Abaqus, the first software package that was (and still is) run on this cluster.
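Put together, a job script for this queue might look like the following sketch. The two queue lines are the ones given above; the job name sw_test, the helper job_banner, and the commented-out program line are hypothetical. The options -S, -N and -cwd are standard Grid Engine options, and JOB_ID and NSLOTS are variables that Grid Engine sets inside a running job. The defaults keep the script runnable outside the scheduler for a dry run.

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -N sw_test
#$ -cwd
#$ -q abaqus.q
#$ -l qname=abaqus.q

# Print where the job runs and how many slots it was given.
# Outside Grid Engine, JOB_ID and NSLOTS are unset, so defaults are used.
job_banner() {
    echo "job ${JOB_ID:-none} on $(uname -n) with ${NSLOTS:-1} slot(s)"
}
job_banner

# ./my_program    # hypothetical executable; replace with your own
```

Saved under a name such as sw_test.sh (again hypothetical), the script would be submitted with qsub sw_test.sh.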

Note that your jobs will run on dedicated threads, i.e. typically up to 12 processes can be scheduled to a single node. The Grid Engine will do the scheduling, i.e. there is no way for the user to determine which processes run on which cores.
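Because jobs run on dedicated threads, a multi-threaded program should not start more threads than the scheduler has granted. A minimal sketch, assuming an OpenMP program: OMP_NUM_THREADS is the standard OpenMP variable, and NSLOTS is set by Grid Engine inside a running job. Requesting more than one slot requires a parallel environment, whose name is site-specific and not given on this page.

```shell
# Tie the OpenMP thread count to the slots granted by Grid Engine.
# Outside a job NSLOTS is unset, so fall back to a single thread.
export OMP_NUM_THREADS="${NSLOTS:-1}"
echo "using $OMP_NUM_THREADS thread(s)"
```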

Help?

General information about using HPCVL facilities can be found in our FAQ pages. We also supply user support (please send email to help@hpcvl.org or contact us directly), so if you experience problems, we can assist you.