Difference between revisions of "HowTo:gamess"

From CAC Wiki
(Created page with "= Gamess (US) = This is an introduction to the usage of the electronic-structure code "Gamess" on the HPCVL clusters. It is meant as an initial pointer to more detailed infor...")
 
(Submitting (parallel) Gamess jobs)
 
(20 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
= Gamess (US) =
 
= Gamess (US) =
  
This is an introduction to the usage of the electronic-structure code "Gamess" on the HPCVL clusters. It is meant as an initial pointer to more detailed information, and to get started. It does not replace study of the manual.
+
This is an introduction to the usage of the electronic-structure code "Gamess" on our clusters. It is meant as an initial pointer to more detailed information, and to get started. It does not replace study of the manual.
 
{|  style="border-spacing: 8px;"
 
{|  style="border-spacing: 8px;"
 
| valign="top" width="50%" style="padding:1em; border:1px solid #aaaaaa; background-color:#e1eaf1; border-radius:7px" |  
 
| valign="top" width="50%" style="padding:1em; border:1px solid #aaaaaa; background-color:#e1eaf1; border-radius:7px" |  
Line 28: Line 28:
 
== Location of the program and setup ==
 
== Location of the program and setup ==
  
The program resides in '''/opt/gamess''' and is called '''gamess.01.x'''. You also find some test examples in the program directory, which are useful to get an idea of the input format for the program. You are '''not''' allowed to copy the executable or any part of the distribution onto your local machine. However you can easily obtain the program yourself. See the [http://www.msg.ameslab.gov/gamess/download.html GAMESS source code distribution page].
+
The program resides on the so-called CVMFS stack (provided by Compute Canada, contains most of our application software). You also find some test examples in the program directory, which are useful to get an idea of the input format for the program. You are '''not''' allowed to copy the executable or any part of the distribution onto your local machine. However you can easily obtain the program yourself. See the [http://www.msg.ameslab.gov/gamess/download.html GAMESS source code distribution page].
  
Unlike other programs, no special setup is needed to run Gamess. All environment variables etc. are set with an execution script that will be described in the next section. However, it is a good idea to put the directory with the Gamess program into the path, i.e. set the PATH environment variable. This is best done through the usepackage utility, simply by typing <pre>use gamess </pre>
+
Setup is done via the standard module load:
 +
 
 +
<pre>
 +
module load gamess-us
 +
</pre>
  
 
== Scratch files ==
 
== Scratch files ==
  
One of the settings is the environment variable '''GAUSS_SCRDIR''' which is required to redirect the temporary files that Gaussian uses to the proper scratch space, presently
+
When you run Gamess, the '''scratch space directory''' is set to '''/scratch/$USER''', and all temporary files and intermediate output are placed there. The program also requires a '''local temporary directory''' called '''scr''' directly below your home directory ('''$HOME/scr'''). For instance, if your username is "hpc9876", you will need "/scratch/hpc9876" and "/home/hpc9876/scr". The former is created automatically when you obtain an account, but the latter has to be created by you explicitly. Before running Gamess again with the same case_name, move any files you want to keep out of $HOME/scr: a second run requires that some temporary files from the first run be removed, and it will fail if they are still present in the scratch directory.
 
+
<pre>export GAUSS_SCRDIR=/scratch/hpcXXXX</pre>
+
 
+
where hpcXXXX stands for your username. If for some reason Gaussian does not terminate normally (e.g. a job gets cancelled), it leaves behind large '''scratch files''' which you may have to delete manually. To check if such files exist, type
+
 
+
<pre>ls -lt /scratch/hpcXXXX</pre>
+
 
+
Once you have determined that the scratch files are no longer needed (because the program that used them is not running any more), you can delete them by typing
+
 
+
<pre>rm /scratch/hpcxxxx/Gau-*</pre>
+
 
+
Cleaning up the scratch space is the user's responsibility. If it is not done regularly, it can cause jobs to terminate, and much work to be lost.
+
 
|}
 
|}
 
{|  style="border-spacing: 8px;"
 
{|  style="border-spacing: 8px;"
| valign="top" width="50%" style="padding:1em; border:1px solid #aaaaaa; background-color:#f7f7f7; border-radius:7px" |  
+
| valign="top" width="50%" style="padding:1em; border:1px solid #aaaaaa; background-color:#f7f7f7; border-radius:7px" |
== Running Gaussian from a command line==
+
  
To run Gaussian on our systems, you have to belong to a '''user group g98''' (it's called that for historical reason, but it applies to all versions of Gaussian). You need to read our license agreement and [http://www.hpcvl.org/sites/default/files/gaussian-statement.pdf signed a statement] to be included in this user group. Once you are, you can access the executables.
+
== Running Gamess from a command line==
  
A computation is performed by preparing an input file and pipe it to standard input of the program '''g09'''. Standard output should be caught in a log-file. We suggest you use the '''extensions''' ''.g09 for input'' files and ''.log for results''.
+
To run Gamess, a script file '''rungms''' is used. This file resides in the same directory as the '''gamess.00.x''' executable. Assuming that the home of the script file and executable is in your path, all you need to do is type
 +
<pre>rungms case_name 01 n_procs</pre>
 +
where case_name is the name of the input file (file extension is assumed to be .inp and must not be specified), and n_procs stands for the number of processes to be used in a parallel Gamess run. If n_procs=1 a serial run will be performed.  
  
Interactively, the command line to invoke Gaussian is thus:
+
'''Note:''' It is absolutely essential to have a good idea about the size and complexity of your calculations before you start a Gamess job. Many of the methods have terrible scaling properties, i.e. the computational cost grows very quickly with the number of electrons, degrees of freedom, or number of basis functions used. We suggest you start with a small basis set and a cheap method, and then slowly increase those parameters.
  
<pre> g09 < test.g09 >test.gout </pre>
+
Like most programs, Gamess requires an '''input (.inp) file''' that describes the system (usually a molecule) for which the calculation will be performed, specifies the level of calculation (eg, CISD), and provides other necessary information (starting orbitals, basis sets, required properties, etc). The format of the input is considerably more demanding than the one required for Gaussian (another widely used electronic-structure program), and much less information is hidden inside of defaults. This makes Gamess a very flexible program, but increases the risk of doing something wrong. Careful study of example input files and the documentation is required to run Gamess successfully. This is particularly true for CI or CAS-SCF calculations.
  
This will only work if you are a member of the g98 group and have set the environment correctly. Note that the '''interactive execution of Gaussian is only meant for test runs'''.
+
Once an input file is prepared, you will have to make the decision if you want to run Gamess in serial or in parallel mode. Gamess supports the use of multiple processors. However, the scaling (ie, the efficiency of parallel processing) varies with the type of calculation and the systems. We suggest you perform a small test calculation of the same type as your production calculation (eg, with a minimal basis set), and rerun it several times with a varying number of processors. Compare the timings and use the maximum number of processors that yield acceptable scaling for your production calculation.
  
Gaussian input files are explained in the "User's Reference". It is impossible to give an outline here. Sample files can be found in
+
== Submitting (parallel) Gamess jobs ==
  
<pre>/opt/gaussian/g09/bsd/logs</pre>
+
Gamess has to be run via SLURM, which is a load-balancing program that submits batch jobs to low-load processors on the cluster. [[SLURM|To learn more about this program, click here]]. A Gamess job is submitted to SLURM in the form of an execution script. A reasonable execution script for Gamess looks like this:
  
'''Note:''' It is absolutely essential to have a good idea about the size and complexity of your calculations before you start a Gaussian job. Many of the methods mentioned above have terrible scaling properties, i.e. the computational cost grows very quickly with the number of electrons, degrees of freedom, or number of basis functions used. We suggest you start with a small basis set and a cheap method, and then slowly increase those parameters.
+
<pre>
 +
#!/bin/bash
 +
#SBATCH --output=STD.out
 +
#SBATCH --error=STD.err
 +
#SBATCH --nodes=1
 +
#SBATCH --ntasks=1
 +
#SBATCH --cpus-per-task=8      # Number of CPUs
 +
#SBATCH --mem-per-cpu=4G        # memory per CPU (4 GB)
 +
#SBATCH --time=0-00:30          # time (DD-HH:MM)
  
== Submitting (parallel) Gaussian jobs ==
+
export SLURM_CPUS_PER_TASK
  
If you want to run Gaussian on several processors on our machines, you have to include a line in your input file:
+
module load gamess-us/20170420-R1
 
+
rungms name.inp  &>  name.out
<pre>%nproc=8</pre>
+
 
+
where we assume that you want to use 8 processors (cores, threads).
+
 
+
'''It is mandatory to submit a Gaussian job script through our scheduling software''' (see our [[HowTo:Scheduler|Scheduler Help File]] for details).
+
 
+
Here is a "bare bones" sample of a Gaussian submission script:
+
 
+
<pre>
+
#! /bin/bash
+
#$ -S /bin/bash
+
#$ -q abaqus.q
+
#$ -l qname=abaqus.q
+
#$ -cwd
+
#$ -V
+
#$ -M hpcXXXX@localhost
+
#$ -m be
+
#$ -o STD.out
+
#$ -e STD.err
+
#$ -pe shm.pe 8
+
g09 < sample.g09 > sample.log
+
 
</pre>
 
</pre>
  
* The first 6 lines of the script make sure the right shell is used, the program executes on the correct cluster, and all necessary setup is done.
+
The --output and --error lines define the standard output and standard error files, respectively.  
* An email address for notifications is specified in the '''#$ -M''' line. We suggest "hpcXXXX@localhost" (hpcXXXX stands for the username). Place a file '''.forward''' containing your actual email address into your home directory. 
+
* The '''-o''' and '''-e''' lines are used to tell the system where to write "standard output" and "standard error", i.e. the screen output.
+
* The '''#$ -pe gaussian.pe 8''' line specifies the number of processors the scheduler will allocate (8 in this example). It is crucial to choose the same number as specified in the '''%nrocs=''' line of the input file.  
+
  
The script (let's call it g09.sh) is submitted by the qsub command:
+
Note that all lines starting with "#SBATCH" are directives for SLURM, and will be interpreted when the script is submitted to that program. You also need to specify the name of the input file just like in an interactive run.
  
<pre>qsub g09.sh</pre>
+
The number of processes is specified in the "--cpus-per-task" line, which instructs SLURM to allocate that number of CPUs for you on a single node. The --nodes and --ntasks lines should be kept at 1. You do not have to specify the number of processes separately on the rungms command line, because SLURM passes it on through the $SLURM_CPUS_PER_TASK variable, which the script exports.
  
This must be done from the working directory, i.e. the directory that contains the input file and is supposed to contain the output. Also make sure that you have set up gaussian ('''use g09''') before you submit a job.
+
Assuming your SLURM script is called "gamess.sh", it is submitted to SLURM by typing
 +
<pre>sbatch gamess.sh</pre>
 +
No further specification of the output is necessary, since this is done inside the script and handled by SLURM.
 
|}
 
|}
 
{|  style="border-spacing: 8px;"
 
{|  style="border-spacing: 8px;"
| valign="top" width="50%" style="padding:1em; border:1px solid #aaaaaa; background-color:#e1eaf1; border-radius:7px" |  
+
| valign="top" width="50%" style="padding:1em; border:1px solid #aaaaaa; background-color:#e1eaf1; border-radius:7px" |
 +
 
 
== Licensing ==
 
== Licensing ==
  
Gaussian is a licensed program. The license held by the Centre for Advanced Computing is limited to our computers at our main site. That means that any of our users can use the program on our machines (but nowhere else), whether they are located at Queen's or not.
+
Gamess is a licensed program although it is distributed freely. The license held by the Centre for Advanced Computing is limited to our computers at our main site. That means that any of our users can use the program on our machines (but nowhere else), whether they are located at Queen's or not. You are '''not''' allowed to copy the executable or any part of the distribution onto your local machine. However you can easily obtain the program yourself. See the [http://www.msg.ameslab.gov/gamess/download.html GAMESS source code distribution page]. Gamess is a very portable program, and will run on IBM PC's (Windows), on a Mac, a variety of Unix platforms (including Linux), and your cellphone (just kidding).
  
We require users of Gaussian to [http://www.hpcvl.org/sites/default/files/gaussian-statement.pdf sign a statement] in which they state that they are informed about the [http://www.hpcvl.org/sites/default/files/g09-licence.pdf terms of the license] to be included in the Gaussian user group named "g98". Please fax the completed statement to (613) 533-2015 or scan/email to [mailto:cac.admin@queensu.ca cac.admin@queensu.ca].
+
Before you can access the Gamess executables and run the program, you have to read the [http://www.msg.ameslab.gov/gamess/License_Agreement.html license agreement] and [http://cac.queensu.ca/wp-content/uploads/2017/04/cac-gamess-statement.pdf sign a statement] that you have done so. Return it to us (fax to (613) 533-2015 or scan/email to cac.admin@queensu.ca) and we add you to a Unix group "gamess" which enables access to the software.
  
 
== Where can I get more detailed information ? ==
 
== Where can I get more detailed information ? ==
 +
Gamess is not a simple program to run. It requires careful study of the input format, and a certain degree of knowledge about the "nuts and bolts" of computational quantum chemistry.
  
* To learn the basics about Gaussian input and output, refer to the [http://www.gaussian.com/g_tech/g_ur/g09help.htm Gaussian 09 User's Reference].
+
* It is impossible to use the program efficiently without reading the [http://www.msg.ameslab.gov/gamess/documentation.html user documentation].
* For templates, and to get many examples, check out /opt/gaussian/g09/bsd/examples.
+
* There is an official [http://www.msg.ameslab.gov/gamess/index.html Gamess homepage] with some information about the capabilities of the program, downloading a copy yourself, and the history of Gamess.
* The [http://www.gaussian.com Gaussian web page] contains a lot of information.  
+
* '''[mailto:cac.help@queensu.ca Send email to cac.help@queensu.ca]'''. We're happy to help.
* For hardcore computational chemists, there is the [http://www.gaussian.com/g_tech/g09iop.htm Gaussian IOPs Reference], useful if you want to tinker with default settings and internal parameters.
+
* These [http://www.gaussian.com/g_prod/books.htm can be purchased from Gaussian Inc.].
+
* '''Send [mailto:cac.help@queensu.ca|email to cac.help@queensu.ca]'''. We're happy to help.
+
 
|}
 
|}

Latest revision as of 17:46, 13 March 2018

Gamess (US)

This is an introduction to the usage of the electronic-structure code "Gamess" on our clusters. It is meant as an initial pointer to more detailed information, and to get started. It does not replace study of the manual.

Features

The "General Atomic and Molecular Electronic Structure System" (GAMESS) is a quantum chemistry software package that was assembled from various older codes (in particular, HONDO) by M. Dupuis, D. Spangler, and J. J. Wendoloski of the National Resources for Computations in Chemistry (NRCC). The code has undergone great changes since then and is maintained now by the Gordon Research Group at Iowa State University.

Gamess performs a great many standard quantum chemical calculations. These include:

  • RHF, UHF, ROHF, GVB, or MCSCF self-consistent calculations.
  • CI or MP2 corrections to the energy.
  • Semi-empirical MNDO, AM1, or PM3 methods.
  • Analytic energy gradients for SCF, MP2 or CI.
  • Geometry optimizations, saddle point searches, and vibrational frequencies.
  • Intrinsic reaction paths, gradient extremal curves, and dynamic reaction coordinates.
  • Many molecular properties, such as densities, electrostatic potentials, dipole moments, etc.
  • Modelling of solvent effects and electric fields.

For a complete list of capabilities of GAMESS, consult this table or look at the Gamess documentation.

Gamess is described in General Atomic and Molecular Electronic Structure System; M.W.Schmidt, K.K.Baldridge, J.A.Boatz, S.T.Elbert, M.S.Gordon, J.H.Jensen, S.Koseki, N.Matsunaga, K.A.Nguyen, S.Su, T.L.Windus, M.Dupuis, J.A.Montgomery; J. Comput. Chem. 14, 1347-1363 (1993)

and

Advances in electronic structure theory: GAMESS a decade later; M.S.Gordon, M.W.Schmidt pp. 1167-1189 in Theory and Applications of Computational Chemistry: the first forty years; C.E.Dykstra, G.Frenking, K.S.Kim, G.E.Scuseria (editors), Elsevier, Amsterdam, 2005.

Location of the program and setup

The program resides on the so-called CVMFS stack, which is provided by Compute Canada and contains most of our application software. You will also find some test examples in the program directory, which are useful to get an idea of the input format for the program. You are not allowed to copy the executable or any part of the distribution onto your local machine. However, you can easily obtain the program yourself. See the GAMESS source code distribution page.

Setup is done via the standard module load:

module load gamess-us

Scratch files

When you run Gamess, the scratch space directory is set to /scratch/$USER, and all temporary files and intermediate output are placed there. The program also requires a local temporary directory called scr directly below your home directory ($HOME/scr). For instance, if your username is "hpc9876", you will need "/scratch/hpc9876" and "/home/hpc9876/scr". The former is created automatically when you obtain an account, but the latter has to be created by you explicitly. Before running Gamess again with the same case_name, move any files you want to keep out of $HOME/scr: a second run requires that some temporary files from the first run be removed, and it will fail if they are still present in the scratch directory.
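The directory setup described above can be sketched in a few lines of shell. This is only a sketch: the two directory paths come from the text, while the helper name stash_case and the destination $HOME/gamess_keep are made up for illustration and are not part of Gamess or of our setup.

```shell
#!/bin/bash
# Sketch: prepare the directories Gamess expects and tidy up between runs.
# Paths follow the text above; "stash_case" and "gamess_keep" are hypothetical.

scr_dir="$HOME/scr"                          # local temporary directory
scratch_dir="/scratch/${USER:-$(id -un)}"    # created with your account

# Create the local temporary directory if it does not exist yet.
mkdir -p "$scr_dir"

# The scratch space should already exist; warn instead of creating it.
if [ ! -d "$scratch_dir" ]; then
    echo "warning: $scratch_dir not found, ask the help desk" >&2
fi

# Show what earlier runs left behind, newest first.
ls -lt "$scr_dir"

# stash_case CASE_NAME
# Move all files of one case out of $HOME/scr into a time-stamped
# directory so that the same case_name can be run again.
stash_case() {
    local case_name="$1"
    local dest="$HOME/gamess_keep/$1.$(date +%Y%m%d-%H%M%S)"
    local f
    for f in "$scr_dir/$case_name".*; do
        [ -e "$f" ] || continue      # the pattern matched nothing
        mkdir -p "$dest"
        mv "$f" "$dest/"
    done
    return 0
}
```

For example, "stash_case water" would move water.* out of $HOME/scr before a second run of the case "water".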

Running Gamess from a command line

To run Gamess, a script file rungms is used. This file resides in the same directory as the gamess.00.x executable. Assuming that the directory containing the script and the executable is in your path, all you need to do is type

rungms case_name 01 n_procs

where case_name is the name of the input file (file extension is assumed to be .inp and must not be specified), and n_procs stands for the number of processes to be used in a parallel Gamess run. If n_procs=1 a serial run will be performed.

Note: It is absolutely essential to have a good idea about the size and complexity of your calculations before you start a Gamess job. Many of the methods have terrible scaling properties, i.e. the computational cost grows very quickly with the number of electrons, degrees of freedom, or number of basis functions used. We suggest you start with a small basis set and a cheap method, and then slowly increase those parameters.

Like most programs, Gamess requires an input (.inp) file that describes the system (usually a molecule) for which the calculation will be performed, specifies the level of calculation (eg, CISD), and provides other necessary information (starting orbitals, basis sets, required properties, etc). The format of the input is considerably more demanding than the one required for Gaussian (another widely used electronic-structure program), and much less information is hidden inside of defaults. This makes Gamess a very flexible program, but increases the risk of doing something wrong. Careful study of example input files and the documentation is required to run Gamess successfully. This is particularly true for CI or CAS-SCF calculations.
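To give a first impression of the input format, the following shell lines write a very small input file. Treat it as a minimal sketch, not a recommended setup: it assumes a restricted Hartree-Fock single-point energy for water in the STO-3G basis, with an approximate geometry in Angstrom and no symmetry (C1). The file name water.inp is arbitrary. Check every keyword against the Gamess documentation before using it for real work.

```shell
#!/bin/bash
# Sketch: write a minimal Gamess input file (RHF/STO-3G energy of water).
# The quoted 'EOF' keeps the shell from expanding the $GROUP names.
cat > water.inp <<'EOF'
 $CONTRL SCFTYP=RHF RUNTYP=ENERGY $END
 $BASIS  GBASIS=STO NGAUSS=3 $END
 $DATA
Water, RHF/STO-3G single-point energy (illustrative geometry)
C1
O   8.0   0.000   0.000   0.000
H   1.0   0.757   0.586   0.000
H   1.0  -0.757   0.586   0.000
 $END
EOF
```

Each input group starts with a $ name in column 2 and is closed by $END. The $DATA group holds a title line, the point group, and one line per atom (label, nuclear charge, Cartesian coordinates).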

Once an input file is prepared, you have to decide whether to run Gamess in serial or in parallel mode. Gamess supports the use of multiple processors. However, the scaling (ie, the efficiency of parallel processing) varies with the type of calculation and the system. We suggest you perform a small test calculation of the same type as your production calculation (eg, with a minimal basis set), and rerun it several times with a varying number of processors. Compare the timings and use the maximum number of processors that yields acceptable scaling for your production calculation.
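Comparing the timings amounts to computing the speedup T(1)/T(n) and the parallel efficiency, which is the speedup divided by n. The helper below is a small sketch of that arithmetic. The function name efficiency is made up, and the timings in the example calls are invented numbers, not measurements from our clusters.

```shell
#!/bin/bash
# efficiency T1 TN N
# T1 = wall time on 1 core, TN = wall time on N cores (same units).
# Prints the speedup T1/TN and the parallel efficiency (speedup / N).
efficiency() {
    awk -v t1="$1" -v tn="$2" -v n="$3" 'BEGIN {
        s = t1 / tn
        printf "%d cores: speedup %.2f, efficiency %.0f%%\n", n, s, 100 * s / n
    }'
}

# Invented wall times (seconds) for one test case on 1, 2, 4 and 8 cores.
efficiency 400 400 1
efficiency 400 210 2
efficiency 400 120 4
efficiency 400  84 8
```

What counts as acceptable is a judgment call. If the efficiency drops sharply from one core count to the next, the smaller count is usually the better choice for production runs.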

Submitting (parallel) Gamess jobs

Gamess has to be run via SLURM, which is a load-balancing program that submits batch jobs to low-load processors on the cluster. To learn more about this program, click here. A Gamess job is submitted to SLURM in the form of an execution script. A reasonable execution script for Gamess looks like this:

#!/bin/bash
#SBATCH --output=STD.out
#SBATCH --error=STD.err
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8       # Number of CPUs
#SBATCH --mem-per-cpu=4G        # memory per CPU (4 GB)
#SBATCH --time=0-00:30          # time (DD-HH:MM)

export SLURM_CPUS_PER_TASK

module load gamess-us/20170420-R1
rungms name.inp  &>  name.out

The --output and --error lines define the standard output and standard error files, respectively.

Note that all lines starting with "#SBATCH" are directives for SLURM, and will be interpreted when the script is submitted to that program. You also need to specify the name of the input file just like in an interactive run.

The number of processes is specified in the "--cpus-per-task" line, which instructs SLURM to allocate that number of CPUs for you on a single node. The --nodes and --ntasks lines should be kept at 1. You do not have to specify the number of processes separately on the rungms command line, because SLURM passes it on through the $SLURM_CPUS_PER_TASK variable, which the script exports.

Assuming your SLURM script is called "gamess.sh", it is submitted to SLURM by typing

sbatch gamess.sh

No further specification of the output is necessary, since this is done inside the script and handled by SLURM.
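If you run many cases, it can be convenient to generate the execution script instead of editing it by hand. The following is a sketch under stated assumptions: the function name make_gamess_job and the --job-name line are additions for illustration, while the memory, time limit, and module version are simply copied from the example script above and should be adapted to your calculation.

```shell
#!/bin/bash
# make_gamess_job CASE_NAME [NCPUS]
# Writes CASE_NAME.sh, a SLURM script that runs CASE_NAME.inp on NCPUS
# cores of a single node (default: 1 core, i.e. a serial run).
make_gamess_job() {
    local case_name="$1"
    local ncpus="${2:-1}"
    # Unquoted EOF: $case_name and $ncpus are filled in while writing.
    cat > "$case_name.sh" <<EOF
#!/bin/bash
#SBATCH --job-name=$case_name
#SBATCH --output=STD.out
#SBATCH --error=STD.err
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=$ncpus
#SBATCH --mem-per-cpu=4G
#SBATCH --time=0-00:30

export SLURM_CPUS_PER_TASK

module load gamess-us/20170420-R1
rungms $case_name.inp &> $case_name.out
EOF
}

# Example: a 4-core job script for the input file water.inp.
make_gamess_job water 4
```

The generated file is then submitted with "sbatch water.sh", and "squeue -u $USER" shows whether the job is pending or running.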

Licensing

Gamess is a licensed program although it is distributed freely. The license held by the Centre for Advanced Computing is limited to our computers at our main site. That means that any of our users can use the program on our machines (but nowhere else), whether they are located at Queen's or not. You are not allowed to copy the executable or any part of the distribution onto your local machine. However you can easily obtain the program yourself. See the GAMESS source code distribution page. Gamess is a very portable program, and will run on IBM PC's (Windows), on a Mac, a variety of Unix platforms (including Linux), and your cellphone (just kidding).

Before you can access the Gamess executables and run the program, you have to read the license agreement and sign a statement that you have done so. Return it to us (fax to (613) 533-2015 or scan/email to cac.admin@queensu.ca) and we will add you to the Unix group "gamess", which enables access to the software.

Where can I get more detailed information ?

Gamess is not a simple program to run. It requires careful study of the input format, and a certain degree of knowledge about the "nuts and bolts" of computational quantum chemistry.