Difference between revisions of "HowTo:namd"

From CAC Wiki
Latest revision as of 17:55, 16 May 2017

NAMD

This is a quick introduction to the usage of the free but licensed code NAMD2 that is installed on our clusters. It is meant as an initial pointer to more detailed information and a Getting Started primer. It does not replace study of the manual.

Features

NAMD is a parallel code for molecular dynamics simulation of large biomolecular systems, developed by the Theoretical Biophysics Group ("TBG") in the Beckman Institute of the University of Illinois. It is file-compatible with AMBER, CHARMM, and X-PLOR.

Location of the program and setup

The NAMD installation is in /opt/namd. The present version of the program is 2.10, available on the Linux platform as a 64-bit build, so all the relevant executables are in /opt/namd/2.10. Documentation can be found at the main NAMD site, and a simple example (Alanin) is in /opt/namd/2.10/example.

The setup for NAMD is very simple. It is only necessary to type:

use namd

This will add the proper directory to your PATH, and off you go.
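
A quick way to confirm that the setup worked is to ask the shell whether it can find the executable. The following is a minimal sketch; the helper name have_tool is hypothetical and not part of NAMD or of the "use" command.

```shell
#!/bin/bash
# Sketch: confirm that an executable is reachable through PATH.
# have_tool is a hypothetical helper name used only for illustration.

have_tool() {
    # Succeeds if "$1" resolves to a command in the current shell.
    command -v "$1" > /dev/null 2>&1
}

if have_tool namd2; then
    echo "namd2 found at: $(command -v namd2)"
else
    echo "namd2 not found; run 'use namd' first" >&2
fi
```

If the second message appears, the "use namd" command has most likely not been issued in the current shell.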

Running NAMD from a command line

NAMD requires a number of input files to run. These include:

  • A so-called "configuration file" that declares the initial configuration for a molecular dynamics run, the force field files, number of steps in the simulation, etc.
  • A coordinate file that gives the coordinates of the participating atoms or molecules.
  • A parameter file declaring bond lengths, angles, dihedrals, non-bonded parameters, etc.
  • A force-field file declaring parameters associated with atomic and molecular interactions.

Details about the supported format of these input files can be found in the NAMD User's Guide.

NAMD supports several running modes. In the simplest case, it can be run in serial mode by typing:

namd2 config_file

where config_file is the configuration input file mentioned above. It is recommended to give the configuration file the file extension .namd to enable consistent naming of the output files. These will be generated automatically, and the progress of the program run will be tracked on the screen.

NAMD is also able to run in parallel mode. For our shared-memory systems, it is easiest to run it by specifying the number of threads through the +p option:

namd2 +pN config_file

if N threads are requested.
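
As an illustration of the two calls above, the sketch below assembles a namd2 command line for a short test run and derives a log file name from the configuration file name. The helper name namd_command, the file name alanin.namd, and the idea of redirecting the screen output to a .log file are assumptions made for this example. The command is only printed, not executed.

```shell
#!/bin/bash
# Sketch: build the namd2 command line for a short interactive test run.
# namd_command and alanin.namd are hypothetical names used for illustration.

namd_command() {
    # $1 = number of threads, $2 = configuration file (expected to end in .namd)
    local threads="$1" config="$2"
    case "$config" in
        *.namd) ;;
        *) echo "warning: $config does not end in .namd" >&2 ;;
    esac
    # The log name is derived from the configuration file name.
    printf 'namd2 +p%s %s > %s.log\n' "$threads" "$config" "${config%.namd}"
}

namd_command 4 alanin.namd
```

For the call shown, the function prints "namd2 +p4 alanin.namd > alanin.log". Redirecting the output this way keeps a record of what would otherwise only scroll past on the screen.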

Submitting (parallel) NAMD jobs

Only short test jobs of application software can be run interactively on our machines. Production jobs must be submitted via the scheduling software Grid Engine. For usage of this software, please consult our Grid Engine FAQ.

In most cases, you will be running NAMD production jobs in parallel mode. This means that you need to specify the number of CPUs to reserve, one for each NAMD thread. This is done in a Grid Engine submission script:

#!/bin/bash
#$ -S /bin/bash
#$ -V
#$ -cwd
#$ -M {email address}
#$ -m be
#$ -o {screen output file}
#$ -e {screen error file}
#$ -pe shm.pe {number of threads}
namd2 +p$NSLOTS {namd configuration (input) file}

The items in the template that are enclosed in {} must be replaced by the appropriate values. Lines that start with "#$" contain information for Grid Engine. The "#$ -V" line tells GE to inherit the shell setup from the calling shell, for instance the $PATH variable. It is important to remember that you need to set up NAMD by issuing the "use namd" command before submitting the above script.

"#$ -cwd" tells the system to start from the current working directory. "#$ -M" lets the system know your email address, so it can notify you when the job starts and ends. The "#$ -o" and "#$ -e" lines define files that capture output that would go to the screen in an interactive run, coming from the program and the system, respectively. Finally, the "#$ -pe" line defines the number of CPUs to be reserved. The number you insert here is reused through the environment variable $NSLOTS, so that you do not have to type it again in the namd2 command line.

Note that the name of the configuration file that replaces "{namd configuration (input) file}" in the script template should have the file extension .namd, just as in the interactive run.
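
Because $NSLOTS is set by Grid Engine inside a job, it is normally not defined in an interactive shell. The sketch below shows one way to let the same command line work in both situations. The helper name thread_count and the fallback value of 1 are assumptions for this example, not site policy.

```shell
#!/bin/bash
# Sketch: take the thread count from NSLOTS, falling back to 1 outside Grid Engine.
# thread_count is a hypothetical helper; the fallback of 1 is an assumption.

thread_count() {
    # NSLOTS is provided by Grid Engine inside a job and is usually unset otherwise.
    echo "${NSLOTS:-1}"
}

# Print the command that would be run; alanin.namd is a placeholder file name.
echo "namd2 +p$(thread_count) alanin.namd"
```

Inside a job submitted with "#$ -pe shm.pe 8", this prints a command line with +p8. In an interactive shell without NSLOTS, it prints +p1.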

Once you have a proper script file (let's call it namd.sh), you can submit your production job by typing:

qsub namd.sh

The Grid Engine will take care of the rest.
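
To avoid editing the template by hand for every run, the placeholders can be filled in by a small helper. This is only a sketch: the function name write_namd_job, the example email address, the thread count, and the choice to name the output files after the configuration file are all assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: write a Grid Engine submission script for a NAMD run from the template above.
# write_namd_job and all argument values below are hypothetical placeholders.

write_namd_job() {
    # $1 = script to create, $2 = email address, $3 = thread count, $4 = NAMD configuration file
    local script="$1" email="$2" threads="$3" config="$4"
    # Assumption: output and error files are named after the configuration file.
    local base="${config%.namd}"
    cat > "$script" <<EOF
#!/bin/bash
#$ -S /bin/bash
#$ -V
#$ -cwd
#$ -M $email
#$ -m be
#$ -o $base.out
#$ -e $base.err
#$ -pe shm.pe $threads
namd2 +p\$NSLOTS $config
EOF
}

write_namd_job namd.sh user@example.org 8 alanin.namd
cat namd.sh
```

The backslash in front of $NSLOTS keeps the variable unexpanded in the generated file, so that Grid Engine supplies its value when the job runs. The resulting namd.sh can then be submitted with qsub as described above.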

Licensing

NAMD is free for non-commercial use, but it is licensed. As with other licensed software, we ask our users to read through the license agreement that we have with the University of Illinois, and to sign a statement that they agree to abide by the terms of the license. The main issue in the NAMD case is that usage has to be non-commercial.

Once we have received the signed statement (fax to (613) 533-2015 or scan/email to cac.admin@queensu.ca), we will add the user to the Unix group namd, which enables access to the software.

More Information

NAMD requires practice to be used efficiently. We cannot explain its use in any detail here, but:

  • Complete documentation for the program is available in the form of the User's Guide, which is an absolute must-have if you want to use this program.
  • Check out the NAMD website. They feature a very useful FAQ and even a tutorial.
  • There is an active NAMD Mailing List.
  • Send email to cac.help@queensu.ca. We're happy to help.