PETSc

This is a short help file on using PETSc, a scalable suite for the solution of scientific problems based on partial differential equations. Using PETSc productively requires careful study of the manual; this page is only meant as a basic introduction to the local usage of PETSc on servers and clusters at the Centre for Advanced Computing.

Features

From the PETSc web page: "PETSc, pronounced PET-see (the S is silent), is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations. It supports MPI, shared memory pthreads, and GPUs through CUDA or OpenCL, as well as hybrid MPI-shared memory pthreads or MPI-GPU parallelism." (Remark: Since we do not operate any GPUs at present, the corresponding features are not implemented in our version).

"PETSc is intended for use in large-scale application projects, and is easy to use for beginners. Its careful design allows advanced users to have detailed control over the solution process. PETSc includes a large suite of parallel linear and non-linear equation solvers and ODE integrators that are easily used in application codes written in C, C++, and Fortran. PETSc provides many of the mechanisms needed within parallel application codes, such as simple parallel matrix and vector assembly routines that allow the overlap of communication and computation. In addition, PETSc includes support for parallel distributed arrays useful for finite difference methods."

Here is a list of features, also from the webpage:

  • Parallel vectors
  • Parallel matrices (several sparse storage formats, easy, efficient assembly)
  • Scalable parallel preconditioners
  • Krylov subspace methods
  • Parallel Newton-based nonlinear solvers
  • Parallel timestepping (ODE) solvers
  • Automatic profiling of floating point and memory usage
  • Consistent user interface
  • Intensive error checking

Location of the program and setup

The present version of PETSc is 3.5.2 (gcc compiler suite, OpenMPI 1.8). The libraries and executables of the PETSc package are located in the directory /opt/petsc. Note that PETSc was compiled with the OpenMPI 1.8 implementation of MPI and should be used together with that implementation. The compilation used gcc (the GNU C compiler) version 4.4.7, which is the system compiler. PETSc is likely to work fine with newer versions of that compiler, but should not be combined with other compilers or MPI implementations on our systems.

It is not required to sign a statement if you want to use PETSc, as it is distributed under an open-source license. Consult the license terms before using PETSc for commercial purposes.

The setup for PETSc is done, just like with most other software on our systems, through "usepackage". Just issue the command

use petsc

which will set the PETSC_DIR environment variable and add the binary directory to your path. You can also do this manually (for bash):

export PETSC_DIR="/opt/petsc/3.5.2"
export PETSC_ARCH=""
export PATH="/opt/petsc/3.5.2/bin:$PATH"
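
As a quick check that the setup worked (assuming the installation paths given above), inspecting the variable and the PETSc headers should produce something like

$ echo $PETSC_DIR
/opt/petsc/3.5.2
$ ls $PETSC_DIR/include/petscvec.h
/opt/petsc/3.5.2/include/petscvec.h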

Scratch files

One of the settings is the environment variable SCM_TMPDIR which is required to redirect the temporary files that ADF uses to the proper scratch space, presently

/scratch/hpcXXXX

where hpcXXXX stands for your username. If for some reason ADF does not terminate normally (e.g. a job gets cancelled), it leaves behind large scratch files which you may have to delete manually. To check if such files exist, type

ls -lt /scratch/hpcXXXX

Usually the scratch files are in sub-directories that start with kid_. Once you have determined that the scratch files are no longer needed (because the program that used them is not running any more), you can delete them by typing

rm -r /scratch/hpcXXXX/kid_*

Cleaning up the scratch space is the user's responsibility. If it is not done regularly, it can cause jobs to terminate, and much work to be lost.

Writing a PETSc program

PETSc provides a programming framework that lets you solve rather complex scientific problems with a minimum of coding. It is essential to study the User's Manual thoroughly to learn how to write programs that use PETSc. The following small example program creates a parallel vector, adds values to it from every process, and prints it:

#include <petscvec.h>

/* help string printed when the program is run with the -help option */
static char help[] = "Assembles and prints a small parallel vector.\n";

#undef __FUNCT__
#define __FUNCT__ "main"
int main(int argc,char **argv)
{
  PetscErrorCode ierr;
  PetscMPIInt    rank;
  PetscInt       i,N;
  PetscScalar    one = 1.0;
  Vec            x;

  PetscInitialize(&argc,&argv,(char*)0,help);
  ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);

  /* First block */
  ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr);
  ierr = VecSetSizes(x,rank+1,PETSC_DECIDE);CHKERRQ(ierr);
  ierr = VecSetFromOptions(x);CHKERRQ(ierr);
  ierr = VecGetSize(x,&N);CHKERRQ(ierr);
  ierr = VecSet(x,one);CHKERRQ(ierr);

  /* Second Block */
  for (i=0; i<N-rank; i++) {
    ierr = VecSetValues(x,1,&i,&one,ADD_VALUES);CHKERRQ(ierr);
  }
  ierr = VecAssemblyBegin(x);CHKERRQ(ierr);
  ierr = VecAssemblyEnd(x);CHKERRQ(ierr);

  /* Third Block */
  ierr = VecView(x,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);

  ierr = PetscFinalize();
  return 0;
}

The include statement at the top pulls in the definitions for parallel vector operations. It also implicitly adds the basic headers necessary for using PETSc. Many of the variables used (ierr, rank, ...) are declared as PETSc-specific datatypes. These are opaque, i.e. they are used only as arguments in PETSc function calls. Typically, the return value of a PETSc function is an error code, so that calls have the form

ierr = FunName(args);
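
In the sample above, each such call is immediately followed by the CHKERRQ macro, which inspects the returned code and, if it is nonzero, prints a traceback and propagates the error up the call stack, for example:

ierr = VecSet(x,one);CHKERRQ(ierr);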

First we need to initialize PETSc and determine the "rank", i.e. the number that identifies the calling process among all processes. The "First Block" creates a parallel vector x and defines which parts of the vector reside on which process. The PETSc-specific communicator PETSC_COMM_WORLD is used for this. The "Second Block" then assigns specific values to the elements. This is done on the global version of the vector, and PETSc takes care of the details, for instance which process handles which element. After that, the vector is "assembled", i.e. distributed among the processes. This involves two function calls (Begin and End) to make it possible to "do other things" in the meantime; just make sure they do not involve the vector. Finally (in the "Third Block"), we print out the vector and get rid of it. After that, we finalize PETSc usage and return.
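
To see where the output shown further below comes from, trace the sample with two processes: rank 0 owns rank+1 = 1 element and rank 1 owns 2, so the global size is N = 3, and VecSet initializes every entry to 1. In the second block, rank 0 adds 1 to entries 0, 1 and 2 (its loop runs while i < N - rank = 3), while rank 1 adds 1 to entries 0 and 1 only (i < 2). With ADD_VALUES these contributions accumulate during assembly, giving the final vector (3, 3, 2).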

Compiling a PETSc program

Due to the large number of PETSc-specific variables and options, it is best to compile using a makefile. This allows all of these settings to be applied through "include" statements. Here is a bare-bones makefile, without any additional optimization or special options, that will compile the above sample program:

CFLAGS =
include ${PETSC_DIR}/conf/variables
include ${PETSC_DIR}/conf/rules

sample : sample.o  chkopts
        -${CLINKER} -o sample sample.o  ${PETSC_VEC_LIB}

The CFLAGS variable allows you to pass additional compiler options (for instance, CFLAGS = -O2 would request optimization), but we don't do that in our case. The include statements pull in the necessary options and variables for the compiler and teach the make facility how to get from PETSc-based C code to properly compiled object files. When executed through the make command, the code is compiled and then linked with the proper libraries:

hasch@swlogin1$ make sample

/opt/openmpi-1.8/bin/mpicc -o sample.o -c -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0  \
-I/opt/petsc/3.5.2/include -I/opt/petsc/3.5.2/include -I/opt/openmpi-1.8/include `pwd`/sample.c

/opt/openmpi-1.8/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0  -o sample sample.o  \
-Wl,-rpath,/opt/petsc/3.5.2/lib -L/opt/petsc/3.5.2/lib  -lpetsc -Wl,-rpath,/usr/lib64/atlas -L/usr/lib64/atlas -llapack -lf77blas \
-latlas -lX11 -lpthread -lssl -lcrypto -Wl,-rpath,/opt/openmpi-1.8/lib -L/opt/openmpi-1.8/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 \
-L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/opt/openmpi-1.8/lib \
-L/opt/openmpi-1.8/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -Wl,-rpath,/opt/openmpi-1.8/lib \
-Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -lmpi_cxx -lstdc++ -Wl,-rpath,/opt/openmpi-1.8/lib -L/opt/openmpi-1.8/lib \
-Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 \
-L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -ldl -Wl,-rpath,/opt/openmpi-1.8/lib -lmpi -lgcc_s -lpthread -ldl

This reveals an impressive list of options that you really don't want to type in manually. The build produces an object file "sample.o" and the executable "sample".

Running a PETSc program

Once a PETSc program has been compiled, it can usually be executed with the standard mpiexec command. On our systems you have to make sure you are using the OpenMPI 1.8 installation that PETSc was built with; if that is not already the case, "use openmpi-1.8" should do the trick (typing "which mpiexec" should then point into /opt/openmpi-1.8/bin). Executing the "sample" program from the above example with two processes, we get something like:

$ mpiexec -n 2 ./sample
Vec Object: 2 MPI processes
  type: mpi
Process [0]
3
Process [1]
3
2

The "$" is supposed to be a command prompt. More elegantly, we can add two lines to the makefile, for instance

run :
        ${MPIEXEC} -n 2 ./sample

and run the new target through the make command:

$ make run
/opt/openmpi-1.8/bin/mpiexec -n 2 ./sample
Vec Object: 2 MPI processes
  type: mpi
Process [0]
3
Process [1]
3
2

In this case we used the make variable MPIEXEC, which is preset by the PETSc include files and points at the matching mpiexec.
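
If you want to vary the number of processes without editing the makefile each time, the count can also be passed in as a make variable. Here is a minimal sketch; the variable name NP is just an illustrative choice, not a PETSc convention:

NP = 2

run :
        ${MPIEXEC} -n ${NP} ./sample

Invoking "make run NP=4" then overrides the default of 2 from the command line.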

Help

PETSc is a powerful but complex software package, and requires some practice to be used efficiently. In this help file we cannot explain its use in any detail.

  • Start with the User's Manual (http://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf).
  • Further documentation, including manual pages, examples, and tutorials, can be found at http://www.mcs.anl.gov/petsc/petsc-as/documentation/index.html.
  • Send email to cac.help@queensu.ca. We're happy to help.