Difference between revisions of "HowTo:comsol"

From CAC Wiki
(License troubleshooting)
 
(16 intermediate revisions by 2 users not shown)
Line 1: Line 1:
Although COMSOL is installed on the SW cluster, users will need to provide their own license before they are able to use the software.
+
'''Note: This page is no longer valid. We will try to update it to reflect the new CMC licensing scheme.'''
  
Any given COMSOL job requires 4 components to run, namely a license key (follows the format "CMC_#####.key"), a CMC_CADPASS_BATCH_COMSOL.sh licensing script, a job script, and an input .mph file. The CMC license key and CMC_CADPASS_BATCH_COMSOL.sh script should be obtained from CMC. If CMC asks for an IP address of the machine COMSOL will be run on, give them 130.15.59.4 (this is the external IP address of the SW cluster).
+
Although COMSOL is installed on our clusters, users will need to provide their own license before they are able to use the software.
 +
 
 +
Any given COMSOL job requires 4 components to run, namely a license key (follows the format "CMC_#####.key"), a CMC_CADPASS_BATCH_COMSOL.sh licensing script, a job script, and an input .mph file. The CMC license key and CMC_CADPASS_BATCH_COMSOL.sh script should be obtained from CMC. If CMC asks for an IP address of the machine COMSOL will be run on, give them 130.15.59.6 (this is the external IP address of the Frontenac cluster).
  
 
== Using your own license ==
 
== Using your own license ==
Line 34: Line 36:
  
 
<pre>
 
<pre>
mkdir -p /scratch/hpc1234/comsol_scratch
+
mkdir -p /global/scratch/hpc1234/comsol_scratch
mv ~/.comsol/* /scratch/hpc1234/comsol_scratch            # moves any existing COMSOL tempfiles to your new scratch directory
+
mv ~/.comsol/* /global/scratch/hpc1234/comsol_scratch            # moves any existing COMSOL tempfiles to your new scratch directory
rmdir ~/.comsol                                           # if the .comsol directory exists, delete it
+
rmdir ~/.comsol                                                   # if the .comsol directory exists, delete it
ln -s /scratch/hpc1234/comsol_scratch ~/.comsol
+
ln -s /global/scratch/hpc1234/comsol_scratch ~/.comsol
 
</pre>
 
</pre>
  
Line 46: Line 48:
 
<pre>
 
<pre>
 
#!/bin/bash
 
#!/bin/bash
#$ -S /bin/bash
+
#SBATCH --job-name=COMSOL_job
#$ -V
+
#SBATCH --mail-type=ALL
#$ -cwd
+
#SBATCH --mail-user=myEmail@dress.ca
#$ -q abaqus.q
+
#SBATCH -o COMSOL-job.out
#$ -l qname=abaqus.q
+
#SBATCH -e COMSOL-job.err
#$ -o comsol_job.o$JOB_ID
+
#SBATCH -N 1
#$ -j y
+
#SBATCH -n 1
 
+
#SBATCH -c 12
# Change the 12 to however many processors are needed
+
#SBATCH -t 30:00
#$ -pe shm.pe 12
+
#SBATCH --mem=24000
# Change the 24 to however much memory is needed
+
module load comsol
#$ -l mf=24G
+
 
+
use java8
+
use comsol
+
 
(while true; do sleep 60 ; done) | ./CMC_CADPASS_BATCH_COMSOL.sh <Your_UID> &
 
(while true; do sleep 60 ; done) | ./CMC_CADPASS_BATCH_COMSOL.sh <Your_UID> &
 
sleep 30
 
sleep 30
comsol -clustersimple batch -tmpdir /scratch/hpc1234/comsol_scratch -inputfile inputFilename.mph -outputfile outputFilename
+
comsol -np $SLURM_CPUS_PER_TASK batch -tmpdir $TMPDIR -inputfile inputFilename.mph -outputfile outputFilename
 
</pre>
 
</pre>
  
Once done creating this job script, submit the job with "qsub yourJobScriptName.sh". If you've reached this point, congratulations! You can now run COMSOL jobs on the SW cluster.
+
Of course, this script needs to be modified to fit the individual case. Specify your email address with the --mail-user option to get notified when a job starts and finishes. The -o and -e options redirect the screen output from COMSOL and the system, respectively. The -N and -n options specify the node and process numbers and need to stay at 1 for COMSOL. -c specifies the number of cores to be allocated and sets the environment variable SLURM_CPUS_PER_TASK, which is passed to COMSOL. -t sets the time limit (30:00 is 30 minutes) and --mem the memory request in megabytes (24000 is roughly 24 GB).
  
== License troubleshooting ==
+
Please consult our [short guide to the SLURM scheduler] about how to run jobs using SLURM.
  
My license only works for a single node! How can I schedule multiple jobs to one machine?
+
Once you have created this job script, submit the job with "sbatch yourJobScriptName.sh". If you've reached this point, congratulations! You can now run COMSOL jobs on Frontenac.
 
+
Add the following line to your job script. Keep in mind that this may make your jobs considerably more difficult to schedule, as they can only be scheduled on a single node. It is a very good idea to get in touch with CAC user support before doing this, as it might otherwise be extremely difficult to pick a suitable node.
+
<pre>
+
#$ -l hostname=nodeName
+
</pre>
+
  
 +
== License troubleshooting ==
  
 
<pre>
 
<pre>
Line 84: Line 78:
  
 
Check that you have registered and activated your hpc#### username with CMC.
 
Check that you have registered and activated your hpc#### username with CMC.
 +
 +
 +
----
  
  
Line 93: Line 90:
  
 
All of the available seats for your license have been checked out and are in use.
 
All of the available seats for your license have been checked out and are in use.
 +
 +
 +
----
 +
 +
 +
<pre>
 +
A start error occured on node 0: License_error_-15_Cannot_connect_to_license_server_system
 +
</pre>
 +
 +
The CMC_CADPASS_BATCH_COMSOL.sh licensing script will silently fail (and not connect your job to the server) unless the CMC_#####.key file is in the working directory of the job. To change this behavior, edit the CMC_CADPASS_BATCH_COMSOL.sh script so that it points directly to the location of your key: change the line beginning with CMC_KEY to the following:
 +
 +
<pre>CMC_KEY="/absolute/path/to/your/CMC_"$CMC_UID".key"</pre>
 +
 +
 +
----
 +
 +
 +
My license only works for a single node! How can I schedule multiple jobs to one machine?
 +
 +
(... to be added ...)

Latest revision as of 18:19, 24 March 2022

Note: This page is no longer valid. We will try to update it to reflect the new CMC licensing scheme.

Although COMSOL is installed on our clusters, users will need to provide their own license before they are able to use the software.

Any given COMSOL job requires 4 components to run, namely a license key (follows the format "CMC_#####.key"), a CMC_CADPASS_BATCH_COMSOL.sh licensing script, a job script, and an input .mph file. The CMC license key and CMC_CADPASS_BATCH_COMSOL.sh script should be obtained from CMC. If CMC asks for an IP address of the machine COMSOL will be run on, give them 130.15.59.6 (this is the external IP address of the Frontenac cluster).

Using your own license

The CMC_CADPASS_BATCH_COMSOL.sh script requires editing before it can be used. These changes are designed to allow your job scripts to be run non-interactively. Please edit the line beginning with "ssh" and replace it with the following:

ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -p 443 -i $CMC_KEY -L 6601:lmserver-8:6601 -L 16601:lmserver-8:16601 cpass01.cmc.ca -l $CMC_AG

As a test, verify that your license is active. To do this, run CMC_CADPASS_BATCH_COMSOL.sh with the command "./CMC_CADPASS_BATCH_COMSOL.sh <yourUID>". This will open up a connection with CMC and fetch your license status. Your UID is the numeric portion of your COMSOL key. As an example, the UID of "CMC_12345.key" would be "12345".
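
The relationship between key file name and UID described above can be sketched in bash. The function name cmc_uid_from_key is a hypothetical helper introduced here for illustration; it is not part of the CMC licensing script.

```shell
#!/bin/bash
# Hypothetical helper (not part of the CMC script): extract the numeric UID
# from a key file name such as "CMC_12345.key".
cmc_uid_from_key() {
    local name
    name=$(basename "$1")      # drop any leading directory
    name=${name#CMC_}          # strip the "CMC_" prefix
    name=${name%.key}          # strip the ".key" suffix
    case "$name" in
        ''|*[!0-9]*)
            # anything other than plain digits is not a valid UID
            echo "not a CMC key name: $1" >&2
            return 1
            ;;
        *)
            printf '%s\n' "$name"
            ;;
    esac
}

cmc_uid_from_key "CMC_12345.key"    # prints 12345
```

The result could then be passed to the licensing script in place of typing the number by hand, for example ./CMC_CADPASS_BATCH_COMSOL.sh "$(cmc_uid_from_key CMC_12345.key)".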

An active license will show the following output (type "quit" to quit):

Warning: Permanently added '[cpass01.cmc.ca]:443,[130.15.52.80]:443' (RSA) to the list of known hosts.

IP access services                                            Status
 CMC_COMSOL_lmgrd  CMC COMSOL lmgrd (CMC Data center)           Active
 CMC_COMSOL_vendor  CMC COMSOL vendor                            Active

Enter a service name above, or 'help' for further instructions.

appgate>

If you see dashes ("-") instead of "Active", you should get in touch with CMC and ensure your license gets activated. COMSOL jobs will fail with a licensing error until this is resolved.
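
If you save the status text to a file, the check described above can be automated. The following is only a sketch: license_active is a hypothetical helper, and it assumes the status lines look like the sample output shown above (service name first, status as the last word).

```shell
#!/bin/bash
# Sketch: decide from saved status text whether all listed services are "Active".
# license_active is a hypothetical helper; it reads the status text on stdin
# and assumes the layout of the sample output shown above.
license_active() {
    awk '
        /CMC_COMSOL_/ { seen++; if ($NF != "Active") inactive = 1 }
        END { exit ((seen == 0 || inactive) ? 1 : 0) }
    '
}

# Example with the two service lines from the sample output:
printf '%s\n' \
    ' CMC_COMSOL_lmgrd  CMC COMSOL lmgrd (CMC Data center)           Active' \
    ' CMC_COMSOL_vendor  CMC COMSOL vendor                            Active' |
    license_active && echo "license active"
```

The function returns a non-zero status if no service line is found or if any service shows something other than "Active", such as dashes.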

Redirect temporary files

COMSOL will attempt to place a large number of temporary and configuration files (several GB per run) in your home directory and /tmp. This is not recommended on compute clusters: /tmp is not shared between nodes and can fill up quickly, causing COMSOL runs to crash with a disk error, and the files placed in your home directory may use up a significant part of your disk quota under /home. To avoid this, we suggest redirecting all COMSOL temporary files to your scratch directory. Follow these directions to set up the redirection (replace hpc1234 with your user name):

mkdir -p /global/scratch/hpc1234/comsol_scratch
mv ~/.comsol/* /global/scratch/hpc1234/comsol_scratch             # moves any existing COMSOL tempfiles to your new scratch directory
rmdir ~/.comsol                                                   # if the .comsol directory exists, delete it
ln -s /global/scratch/hpc1234/comsol_scratch ~/.comsol
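
The commands above skip hidden files (the * pattern does not match names starting with a dot) and report errors if ~/.comsol does not exist yet or is already a symlink. A more defensive variant might look like the sketch below; redirect_comsol_dir is a hypothetical helper written for this page, not a COMSOL or CAC tool.

```shell
#!/bin/bash
# Sketch: move an existing COMSOL directory to scratch and symlink it.
# redirect_comsol_dir is a hypothetical helper, not a COMSOL or CAC tool.
# Arguments: $1 = directory to replace (e.g. ~/.comsol), $2 = scratch target.
redirect_comsol_dir() {
    local src="$1" dest="$2"
    mkdir -p "$dest" || return 1
    if [ -L "$src" ]; then
        return 0                      # already a symlink: leave it alone
    fi
    if [ -d "$src" ]; then
        # "$src/." copies hidden files too, which "$src/*" would skip
        cp -a "$src/." "$dest/" || return 1
        rm -rf "$src" || return 1
    fi
    ln -s "$dest" "$src"
}
```

It would be called as redirect_comsol_dir "$HOME/.comsol" "/global/scratch/hpc1234/comsol_scratch" (again replacing hpc1234 with your user name). Running it a second time is harmless, because an existing symlink is left untouched.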

Running jobs

Assuming all required files are in the same directory, a typical COMSOL job might look like this (replace <Your_UID> with your UID number):

#!/bin/bash
#SBATCH --job-name=COMSOL_job
#SBATCH --mail-type=ALL
#SBATCH --mail-user=myEmail@dress.ca
#SBATCH -o COMSOL-job.out
#SBATCH -e COMSOL-job.err
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 12
#SBATCH -t 30:00
#SBATCH --mem=24000
module load comsol
(while true; do sleep 60 ; done) | ./CMC_CADPASS_BATCH_COMSOL.sh <Your_UID> &
sleep 30
comsol -np $SLURM_CPUS_PER_TASK batch -tmpdir $TMPDIR -inputfile inputFilename.mph -outputfile outputFilename

Of course, this script needs to be modified to fit the individual case. Specify your email address with the --mail-user option to get notified when a job starts and finishes. The -o and -e options redirect the screen output from COMSOL and the system, respectively. The -N and -n options specify the node and process numbers and need to stay at 1 for COMSOL. -c specifies the number of cores to be allocated and sets the environment variable SLURM_CPUS_PER_TASK, which is passed to COMSOL. -t sets the time limit (30:00 is 30 minutes) and --mem the memory request in megabytes (24000 is roughly 24 GB).
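
Slurm only sets SLURM_CPUS_PER_TASK when the -c (--cpus-per-task) option is given, so a script that drops the "#SBATCH -c" line would pass an empty value to "comsol -np". The sketch below shows one way to guard against that; comsol_np is a hypothetical helper name, and falling back to a single core is an assumption, not a CAC recommendation.

```shell
#!/bin/bash
# Sketch: choose the core count to pass to "comsol -np".
# comsol_np is a hypothetical helper. Slurm sets SLURM_CPUS_PER_TASK only
# when -c/--cpus-per-task is requested, so fall back to 1 core otherwise.
comsol_np() {
    printf '%s\n' "${SLURM_CPUS_PER_TASK:-1}"
}

echo "COMSOL would be started with -np $(comsol_np)"
```

In the job script, the COMSOL line would then start with comsol -np "$(comsol_np)" batch, followed by the same options as before.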

Please consult our [short guide to the SLURM scheduler] about how to run jobs using SLURM.

Once you have created this job script, submit the job with "sbatch yourJobScriptName.sh". If you've reached this point, congratulations! You can now run COMSOL jobs on Frontenac.

License troubleshooting

A start error occured on node 11: Could_not_obtain_license_for#Cluster Node
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 11

Check that you have registered and activated your hpc#### username with CMC.




Could not obtain license for COMSOL Multiphysics.
License error -5. 
No such product exists.

All of the available seats for your license have been checked out and are in use.




A start error occured on node 0: License_error_-15_Cannot_connect_to_license_server_system

The CMC_CADPASS_BATCH_COMSOL.sh licensing script will silently fail (and not connect your job to the server) unless the CMC_#####.key file is in the working directory of the job. To change this behavior, edit the CMC_CADPASS_BATCH_COMSOL.sh script so that it points directly to the location of your key: change the line beginning with CMC_KEY to the following:

CMC_KEY="/absolute/path/to/your/CMC_"$CMC_UID".key"
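
Because the licensing script fails silently, it can help to test for the key file in the job script before the licensing script is started. The following is a minimal sketch; check_cmc_key is a hypothetical helper, and the path you pass to it is whatever location you chose for your key.

```shell
#!/bin/bash
# Sketch: fail early and visibly if the CMC key file is missing or unreadable.
# check_cmc_key is a hypothetical helper, not part of the CMC script.
check_cmc_key() {
    local key="$1"
    if [ ! -r "$key" ]; then
        echo "CMC key not found or not readable: $key" >&2
        return 1
    fi
    echo "Using CMC key: $key"
}
```

Placed before the line that starts CMC_CADPASS_BATCH_COMSOL.sh, a call such as check_cmc_key "/absolute/path/to/your/CMC_<Your_UID>.key" || exit 1 stops the job with a clear message in the error file instead of a later license error -15.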




My license only works for a single node! How can I schedule multiple jobs to one machine?

(... to be added ...)