Difference between revisions of "HowTo:nwchem"

From CAC Wiki

Revision as of 17:32, 27 May 2016

NWChem

This is a short "Frequently Asked Questions" file on using the parallel electronic-structure code "NWChem" on systems at the Centre for Advanced Computing. This software uses MPI as a message-passing system and is (in principle) able to run on an arbitrary number of processors. Its ability to perform a very broad spectrum of molecular-structure calculations, ranging from CI to ab-initio molecular dynamics, makes it an interesting alternative to the standard electronic structure code Gaussian.

Features

NWChem is an electronic-structure code suited to complex molecular-structure calculations. It was specifically designed to perform well on high-performance parallel computers. The installation on the SunFire cluster of HPCVL uses the MPI message-passing package for parallel execution.

NWChem allows, among others, the following calculations to be performed:

  • Hartree-Fock (e.g. RHF, UHF, ROHF etc.)
  • DFT including spin-orbit DFT, with many exchange and correlation functionals.
  • Complete Active Space SCF (CAS-SCF)
  • Coupled-Cluster (CCSD, CCSD+T, etc.)
  • Limited CI (e.g. CISD) with perturbative corrections
  • MP2 (2nd-order Møller-Plesset Perturbation Theory)
  • In general: single-point calculations, geometry optimizations, vibrational analysis.
  • Static one-electron properties, densities, electrostatic potentials.
  • ONIOM model for multi-level calculations on larger systems
  • Relativistic corrections (Douglas-Kroll, Dyall-Dirac, spin-orbit)
  • Ab-initio molecular dynamics (Car-Parrinello)
  • Extended (solid-state) systems DFT
  • Classical force-fields (Molecular Mechanics: AMBER, CHARMM, etc.)

For a more complete list, see the official NWChem homepage and click on "capabilities".

Location of the program and setup

The NWChem program is located in the directory /opt/nwchem/bin. To access it, you have to use the usepackage command

use nwchem

which will set you up automatically.

Scratch files

One of the settings is the environment variable GAUSS_SCRDIR, which redirects the temporary files that Gaussian uses to the proper scratch space, presently

export GAUSS_SCRDIR=/scratch/hpcXXXX

where hpcXXXX stands for your username. If for some reason Gaussian does not terminate normally (e.g. a job gets cancelled), it leaves behind large scratch files which you may have to delete manually. To check if such files exist, type

ls -lt /scratch/hpcXXXX

Once you have determined that the scratch files are no longer needed (because the program that used them is not running any more), you can delete them by typing

rm /scratch/hpcXXXX/Gau-*

Cleaning up the scratch space is the user's responsibility. If it is not done regularly, it can cause jobs to terminate, and much work to be lost.
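The manual check-and-delete steps above can be wrapped in two small shell functions. This is only a sketch: the function names are hypothetical, the Gau-* pattern is taken from the rm command above, and the check for a running process named g09 is an assumption that only covers the node you are logged in to, not jobs running elsewhere on the cluster.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not part of any site tooling.

# Print leftover Gaussian scratch files (Gau-*) in the given directory.
list_gaussian_scratch() {
    local dir="$1"
    find "$dir" -maxdepth 1 -type f -name 'Gau-*' -print | sort
}

# Delete leftover Gau-* files, but refuse if a process named g09
# (assumed process name) is still running for the current user on this node.
clean_gaussian_scratch() {
    local dir="$1"
    if pgrep -u "$(id -u)" -x g09 > /dev/null 2>&1; then
        echo "g09 is still running; not deleting anything" >&2
        return 1
    fi
    find "$dir" -maxdepth 1 -type f -name 'Gau-*' -delete
}
```

For example, after loading these functions, list_gaussian_scratch /scratch/hpcXXXX shows what would be removed, and clean_gaussian_scratch /scratch/hpcXXXX removes it. Files that do not match Gau-* are left untouched.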

Running Gaussian from a command line

To run Gaussian on our systems, you have to belong to the user group g98 (it is called that for historical reasons, but it applies to all versions of Gaussian). To be included in this group, you need to read our license agreement and sign a statement. Once you are a member, you can access the executables.

A computation is performed by preparing an input file and piping it to the standard input of the program g09. Standard output should be captured in a log file. We suggest you use the extension .g09 for input files and .log for results.

Interactively, the command line to invoke Gaussian is thus:

 g09 < test.g09 > test.log

This will only work if you are a member of the g98 group and have set the environment correctly. Note that the interactive execution of Gaussian is only meant for test runs.
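Both conditions can be checked before a test run. The following is a minimal sketch with hypothetical function names; it assumes that group membership is visible through the standard id command and that input files use the suggested .g09 extension.

```shell
#!/bin/bash
# Sketch only: function names are hypothetical.

# Succeed if the space-separated group list ($2) contains the group ($1).
has_group() {
    local wanted="$1" list="$2" g
    for g in $list; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

# Run a short interactive test only if the current user is in group g98.
# The log file name is derived from the input name (test.g09 -> test.log).
run_g09_test() {
    local input="$1"
    local output="${input%.g09}.log"
    if ! has_group g98 "$(id -nG)"; then
        echo "not a member of group g98" >&2
        return 1
    fi
    g09 < "$input" > "$output"
}
```

With these functions loaded, run_g09_test test.g09 either reports the missing group membership or writes the results to test.log.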

Gaussian input files are explained in the "User's Reference". It is impossible to give an outline here. Sample files can be found in

/opt/gaussian/g09/bsd/logs

Note: It is absolutely essential to have a good idea about the size and complexity of your calculations before you start a Gaussian job. Many of the methods mentioned above have terrible scaling properties, i.e. the computational cost grows very quickly with the number of electrons, degrees of freedom, or number of basis functions used. We suggest you start with a small basis set and a cheap method, and then slowly increase those parameters.

Submitting (parallel) Gaussian jobs

If you want to run Gaussian on several processors on our machines, you have to include a line in your input file:

%nproc=8

where we assume that you want to use 8 processors (cores, threads).

It is mandatory to submit a Gaussian job script through our scheduling software (see our Scheduler Help File for details).

Here is a "bare bones" sample of a Gaussian submission script:

#! /bin/bash
#$ -S /bin/bash
#$ -q abaqus.q
#$ -l qname=abaqus.q
#$ -cwd
#$ -V
#$ -M hpcXXXX@localhost
#$ -m be
#$ -o STD.out
#$ -e STD.err
#$ -pe shm.pe 8
g09 < sample.g09 > sample.log
  • The first 6 lines of the script make sure the right shell is used, the program executes on the correct cluster, and all necessary setup is done.
  • An email address for notifications is specified in the #$ -M line. We suggest "hpcXXXX@localhost" (hpcXXXX stands for the username). Place a file .forward containing your actual email address into your home directory.
  • The -o and -e lines are used to tell the system where to write "standard output" and "standard error", i.e. the screen output.
  • The #$ -pe shm.pe 8 line specifies the number of processors the scheduler will allocate (8 in this example). It is crucial to choose the same number as specified in the %nproc= line of the input file.
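Because a mismatch between the two processor counts is easy to overlook, it can help to compare them before submitting. The sketch below uses hypothetical function names and assumes the input file contains a lowercase %nproc=N line and the script a "#$ -pe <name> N" line, as in the examples on this page.

```shell
#!/bin/bash
# Sketch only: function names are hypothetical.

# Extract N from the first "%nproc=N" line of a Gaussian input file.
input_nproc() {
    sed -n 's/^%nproc=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Extract N from the first "#$ -pe <name> N" line of a submission script.
script_slots() {
    awk '$1 == "#$" && $2 == "-pe" { print $4; exit }' "$1"
}

# Compare the two counts; fail if either is missing or they differ.
check_slots() {
    local n_input n_script
    n_input=$(input_nproc "$1")
    n_script=$(script_slots "$2")
    if [ -n "$n_input" ] && [ "$n_input" = "$n_script" ]; then
        echo "OK: both request $n_input processors"
    else
        echo "MISMATCH: input has '${n_input}', script has '${n_script}'" >&2
        return 1
    fi
}
```

A typical call would be check_slots sample.g09 g09.sh, run from the working directory before the job is submitted.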

The script (let's call it g09.sh) is submitted by the qsub command:

qsub g09.sh

This must be done from the working directory, i.e. the directory that contains the input file and will receive the output. Also make sure that you have set up Gaussian (use g09) before you submit a job.

Licensing

Gaussian is a licensed program. The license held by the Centre for Advanced Computing is limited to our computers at our main site. That means that any of our users can use the program on our machines (but nowhere else), whether they are located at Queen's or not.

To be included in the Gaussian user group named "g98", users must sign a statement confirming that they have been informed about the terms of the license. Please fax the completed statement to (613) 533-2015 or scan and email it to cac.admin@queensu.ca.

Where can I get more detailed information?