Revision as of 14:10, 9 September 2016

The SW cluster is presently our main compute cluster. Note that it has undergone a major hardware upgrade and that large portions of these pages are subject to change. Please revisit them occasionally to stay up to date.

The SW (Linux) Cluster


The Centre for Advanced Computing operates a cluster of X86-based multicore machines running Linux. This page explains the essential features of this cluster and is meant as a basic guide to its usage.

SW (Linux) Cluster Nodes (sw series)
Host	CPU model	Speed	Cores	Threads	Memory
sw0011	Xeon X5675	3.07GHz	12	24	64 GB
sw0012	Xeon X5675	3.07GHz	12	24	64 GB
sw0013	Xeon X5675	3.07GHz	12	24	64 GB
sw0014	Xeon X5675	3.07GHz	12	24	64 GB
sw0015	Xeon X5675	3.07GHz	12	24	64 GB
sw0016	Xeon X5675	3.07GHz	12	24	64 GB
sw0017	Xeon X5675	3.07GHz	12	24	64 GB
sw0018	Xeon X5675	3.07GHz	12	24	64 GB
sw0019	Xeon X5675	3.07GHz	12	24	64 GB
sw0020	Xeon X5675	3.07GHz	12	24	64 GB
sw0021	Xeon X5675	3.07GHz	12	24	64 GB
sw0022	Xeon X5675	3.07GHz	12	24	64 GB
sw0023	Xeon X5675	3.07GHz	12	24	32 GB
sw0024	Xeon X5675	3.07GHz	12	24	32 GB
sw0025	Xeon X5675	3.07GHz	12	24	32 GB
sw0026	Xeon X5675	3.07GHz	12	24	32 GB
sw0027	Xeon X5675	3.07GHz	12	24	32 GB
sw0028	Xeon X5675	3.07GHz	12	24	32 GB
sw0029	Xeon X5675	3.07GHz	12	24	32 GB
sw0030	Xeon X5675	3.07GHz	12	24	32 GB
sw0031	Xeon X5675	3.07GHz	12	24	32 GB
sw0032	Xeon X5675	3.07GHz	12	24	32 GB
sw0033	Xeon X5675	3.07GHz	12	24	32 GB
sw0034	Xeon X5675	3.07GHz	12	24	32 GB
sw0035	Xeon X5670	2.93GHz	12	24	64 GB
sw0036	Xeon X5670	2.93GHz	12	24	64 GB
sw0037	Xeon X5670	2.93GHz	12	24	64 GB
sw0038	Xeon X5670	2.93GHz	12	24	64 GB
sw0039	Xeon X5670	2.93GHz	12	24	64 GB
sw0040	Xeon X5670	2.93GHz	12	24	64 GB
sw0041	Xeon E7-4860	2.27GHz	40	80	256 GB
sw0042	Xeon E7-4860	2.27GHz	40	80	256 GB
sw0043	Xeon E7-4860	2.27GHz	40	80	256 GB
sw0044	Xeon E7-4860	2.27GHz	40	80	256 GB
sw0045	Xeon E7-4860	2.27GHz	40	80	256 GB
sw0046	Xeon E7-4860	2.27GHz	40	80	256 GB
sw0047	Xeon E7-4860	2.27GHz	40	80	256 GB
sw0048	Xeon E7-4860	2.27GHz	40	80	256 GB
sw0049	Xeon E7-4860	2.27GHz	40	80	256 GB
sw0050	Xeon E7-4860	2.27GHz	40	80	1 TB
sw0051	Xeon E7-4860	2.27GHz	40	80	1 TB
sw0052	Xeon E7-8860	2.27GHz	80	160	512 GB
sw0053	Xeon E7-8870	2.40GHz	80	160	512 GB
sw0054	Xeon E7-8860	2.27GHz	80	160	512 GB
sw0058	Xeon E7-8860	2.27GHz	48	96	128 GB
sw0059	Xeon E7-8860	2.27GHz	48	96	128 GB

SW (Linux) Cluster Nodes (cac series)
Host	CPU model	Speed	Cores	Threads	Memory
cac011	E5-2650	2.2 GHz	24	48	256 GB
cac012	E5-2650	2.2 GHz	24	48	256 GB
cac013	E5-2650	2.2 GHz	24	48	256 GB
cac014	E5-2650	2.2 GHz	24	48	256 GB
cac015	E5-2650	2.2 GHz	24	48	256 GB
cac016	E5-2650	2.2 GHz	24	48	256 GB
cac017	E5-2650	2.2 GHz	24	48	256 GB
cac018	E5-2650	2.2 GHz	24	48	256 GB
cac019	E5-2650	2.2 GHz	24	48	256 GB
cac020	E5-2650	2.2 GHz	24	48	256 GB
cac021	E5-2650	2.2 GHz	24	48	256 GB
cac022	E5-2650	2.2 GHz	24	48	256 GB
cac023	E5-2650	2.2 GHz	24	48	256 GB
cac024	E5-2650	2.2 GHz	24	48	256 GB
cac025	E5-2650	2.2 GHz	24	48	256 GB
cac026	E5-2650	2.2 GHz	24	48	256 GB
cac027	E5-2650	2.2 GHz	24	48	256 GB
cac028	E7-8867	2.5 GHz	128	256	2 TB
cac029	E7-8867	2.5 GHz	128	256	2 TB
cac030	E7-8867	2.5 GHz	128	256	2 TB
cac032	E7-8867	2.5 GHz	128	256	2 TB
cac033	E7-8867	2.5 GHz	128	256	2 TB
cac034	E5-2650	2.2 GHz	24	48	256 GB
cac035	E5-2650	2.2 GHz	24	48	256 GB
cac036	E5-2650	2.2 GHz	24	48	256 GB
cac037	E5-2650	2.2 GHz	24	48	256 GB
cac038	E5-2650	2.2 GHz	24	48	256 GB
cac039	E5-2650	2.2 GHz	24	48	256 GB
cac040	E5-2650	2.2 GHz	24	48	256 GB
cac041	E5-2650	2.2 GHz	24	48	256 GB
cac042	E5-2650	2.2 GHz	24	48	256 GB
cac043	E5-2650	2.2 GHz	24	48	256 GB
cac044	E5-2650	2.2 GHz	24	48	256 GB
cac045	E5-2650	2.2 GHz	24	48	256 GB
cac046	E5-2650	2.2 GHz	24	48	256 GB
cac047	E5-2650	2.2 GHz	24	48	256 GB
cac048	E5-2650	2.2 GHz	24	48	256 GB
cac049	E5-2650	2.2 GHz	24	48	256 GB
cac050	E5-2650	2.2 GHz	24	48	256 GB
cac051	E5-2650	2.2 GHz	24	48	256 GB
cac052	E5-2650	2.2 GHz	24	48	256 GB
cac053	E5-2650	2.2 GHz	24	48	256 GB
cac054	E5-2650	2.2 GHz	24	48	256 GB
cac055	E5-2650	2.2 GHz	24	48	256 GB
cac056	E5-2650	2.2 GHz	24	48	256 GB
cac057	E5-2650	2.2 GHz	24	48	256 GB
cac058	E5-2650	2.2 GHz	24	48	256 GB
cac059	E5-2650	2.2 GHz	24	48	256 GB
cac060	E5-2650	2.2 GHz	24	48	256 GB
cac061	E5-2650	2.2 GHz	24	48	256 GB
cac062	E5-2650	2.2 GHz	24	48	256 GB
cac063	E5-2650	2.2 GHz	24	48	256 GB
cac064	E5-2650	2.2 GHz	24	48	256 GB
cac065	E5-2650	2.2 GHz	24	48	256 GB
cac066	E5-2650	2.2 GHz	24	48	256 GB
cac067	E5-2650	2.2 GHz	24	48	256 GB
cac068	E5-2650	2.2 GHz	24	48	256 GB
cac069	E5-2650	2.2 GHz	24	48	256 GB

Type of Hardware

This cluster consists of X86 multicore nodes made by Dell, IBM, and Lenovo. All nodes run CentOS Linux and share a file system. Access is handled by Grid Engine. The server nodes are called sw0004...sw0059 and cac011...cac069.

Presently, the workup node of the HPCVL "Software Cluster" is swlogin1. This is a Dell PowerEdge R410 Server with 2 sockets, each holding a 6-core Intel® Xeon® processor (Intel X5675) running at 3.07 GHz.

Some of the nodes in the SW cluster are Dell PowerEdge R410 Servers with 2 sockets, each holding a 6-core Intel Xeon processor (Intel X5670 / X5675) running at 2.93 / 3.07 GHz. Each of these nodes offers a total of 12 cores that are 2-fold hyperthreaded, i.e. it supports up to 24 threads. The scheduler is configured so that only 12 threads run at a time. These nodes have 64 Gbyte (sw0015-22, sw0035-40) or 32 Gbyte (sw0023-34) of physical memory.

Some nodes are IBM XServers 3850-X5 that are also based on the Intel® Xeon® processor (Intel E7-4860). These servers have a total of 40 cores per node and support up to 80 threads (hyperthreading). The clock speed of these machines is 2.27 GHz. Two of these servers (sw0050-51) have 1 TB of physical memory; the others have 256 GB.

Two of our nodes are IBM Servers based on Intel E7-8860 processors with 80 cores in total (160 threads) running at 2.27 GHz, while another one (sw0053) with 80 cores (160 threads) uses the E7-8870 at 2.4 GHz. Each of the three has 512 GB of memory.

A recent (August 2016) upgrade added a number of Lenovo NeXtScale nx360 M5 nodes. Each is based on 2 Intel Xeon E5-2650 12-core CPUs that run at 2.2 GHz, for a total of 24 cores per node.

Larger high-memory nodes were added at the same time (August 2016). These are of the Lenovo System x3950 x6 8-socket type with 8 x Intel E7-8867 v3 16-core processors at 2.5 GHz for a total of 128 cores (dually hyperthreaded). Each of these units has a total of 2 TB of memory. They are used for special applications that require high memory.

Why these Systems?

The main emphasis in these systems is high floating-point performance for a modest number of processes / threads. Since commercial software such as Fluent and Abaqus offers support for Linux only, this cluster was originally acquired to offer recent versions of these software packages. In addition, the higher single-core performance of these nodes allows for efficient use of license seats, which are usually priced per core.

Who Should Use This Cluster?

The software cluster runs on the Linux operating system and should be used by anyone who wants to run applications that are available on that platform. Runs that require more than 32 Gbyte of memory need to request this explicitly to avoid mis-scheduling.

We suggest you use this cluster if:

Your application is floating-point intensive with modest amounts of memory.

Your application is commercial or public-domain software that supports Linux.

Your application is explicitly parallel (for instance, using MPI) and has low communication requirements, or is multi-threaded with a small number (typically no more than 12) of scaling threads.

Your application uses a commercial license that is scaled per process.

This cluster may not be suitable if:

Your application is very memory intensive. Long waiting times may be the consequence.

Your application is required to scale to a very large number of processes in a distributed-memory fashion and is communication intensive. Such jobs require a fast interconnect (Infiniband or similar) and should be run on a different system, for instance other Compute Canada installations.

If you think your application could run more efficiently on these machines, please contact us (help@hpcvl.org) to discuss any concerns and let us assist you in getting started.

Note that we have to enforce dedicated cores or CPUs to avoid sharing and context switching overheads. No "overloading" can be allowed.

Using the Cluster

Access

Directly through the xterm (Linux login node) application from the Secure Global Desktop (portal).
Indirectly through ssh from sflogin0:

ssh hpcXXXX@130.15.59.64
hpcXXXX@130.15.59.64's password: *****
hpcXXXX@sflogin0$ ssh swlogin1
hpcXXXX@swlogin1's password: *****
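
The two hops shown above can also be chained into a single command. The following is a minimal sketch, assuming a standard OpenSSH client on your local machine; the function name swlogin is a hypothetical helper introduced here for illustration, and hpcXXXX stands for your own user name as in the session above.

```shell
# Hypothetical helper (not part of the HPCVL setup): log in to swlogin1
# through sflogin0 in one step.
# Usage: swlogin hpcXXXX
swlogin() {
    # -t allocates a terminal on sflogin0 so that the inner ssh can
    # prompt for the second password and start an interactive shell.
    ssh -t "$1@130.15.59.64" ssh swlogin1
}
```

Both passwords are still requested, first for sflogin0 and then for swlogin1.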

The file systems for all of our clusters are shared, so you will be using the same home directory as when you are using the M9000 servers or the standard login node sfnode0. swlogin1 can be used for compilation, program development, and testing only, not for production jobs.

Compiling Code

Intel Compiler Suite

The best compiler to use is the Intel Compiler Suite. This includes compilers for Fortran, C, and C++, as well as MPI and OpenMP support, debuggers and development suite. This software resides in /opt/ics. The versions are:

Fortran (ifort): Intel(R) Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 12.1 Build 20110811
C (icc): Intel(R) C Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 12.1 Build 20110811
C++ (icpc): Intel(R) C++ Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 12.1 Build 20110811

This compiler suite needs to be activated before use. The command is

use icsmpi

GNU Compilers

In many cases, especially for public-domain software, the preferable compiler is GNU C/C++/Fortran. The system version of these is:

Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info 
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix 
--enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions 
--enable-gnu-unique-object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk 
--disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile 
--enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib 
--with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC)

No special activation is needed to use these, as they reside in a system directory. A newer version of this compiler set is available in /opt/gcc-4.8.3 and can be accessed using the command

use gcc-4.8.3

If MPI is required, it can be loaded through

use openmpi

For applications that cannot be re-compiled (for instance, because the source code is not accessible), a pre-compiled Linux version needs to be obtained (an x64 build for Red Hat will do the trick).

Running Jobs

As mentioned earlier, program runs for user and application software on the login node are allowed only for test purposes or if interactive use is unavoidable. In the latter case, please get in touch to let us know what you need. Production jobs must be submitted through the Grid Engine load scheduler.

You need to add the following two lines to your script for your job to be scheduled to the Linux SW cluster exclusively:

#$ -q abaqus.q 
#$ -l qname=abaqus.q

The queue name abaqus.q derives from Abaqus, the first software that was (and still is) run on this cluster.

Note that your jobs will run on dedicated threads, i.e. typically up to 12 processes can be scheduled to a single node. The Grid Engine will do the scheduling, i.e. there is no way for the user to determine which processes run on which cores.
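
To put the two queue directives into context, a complete job script might look like the sketch below. The -S and -cwd options and the NSLOTS variable are standard Grid Engine features; the program name my_program and the file name sw_job.sh are hypothetical, and any parallel environment request (-pe) is left out because its name is site-specific.

```shell
#!/bin/bash
# Sketch of a Grid Engine job script for the SW cluster.
#$ -S /bin/bash
#$ -cwd
#$ -q abaqus.q
#$ -l qname=abaqus.q

# Grid Engine sets NSLOTS to the number of slots granted to the job;
# fall back to 1 when the script is run outside the scheduler.
slots="${NSLOTS:-1}"

# Keep the thread count equal to the granted slots so that the
# dedicated cores are not overloaded.
export OMP_NUM_THREADS="$slots"

echo "Host $(uname -n): running with $OMP_NUM_THREADS thread(s)"

# Hypothetical executable; replace with your own program.
# ./my_program
```

Such a script would be submitted from swlogin1 with qsub sw_job.sh. Because the lines starting with #$ are ordinary comments to the shell, the same script can also be run directly for a quick test.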

Help?

General information about using HPCVL facilities can be found in our FAQ pages. We also supply user support (please send email to help@hpcvl.org or contact us directly), so if you experience problems, we can assist you.

