Read the following lectures in Numerical Linear Algebra. If you are having trouble understanding the ideas
(and believe me, this is tough stuff), I'd be happy to discuss them with you or point you to other references.
Lecture 6: Projectors
Lecture 32: Overview of Iterative Methods
Conduct some hands-on tests with PETSc. This material can also be found in /home/hzhang/PetscHandsOn on the
ada.cs.iit.edu machine. Feel free to copy and paste these commands, as that will help prevent typos. As always,
substitute your own user name anywhere you see #userid#.
Earlier we installed PETSc on the Linux machine, and now we need to remind the shell where that installation lives.
We can do that by setting some shell variables (remember not to put spaces around the equal sign):
export PETSC_DIR=$HOME/soft/petsc-3.1
export PETSC_ARCH=arch-cs595
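If you don't want to retype these every time you log in, you can append them to your shell's startup file
(for bash that is usually ~/.bashrc; adjust for your own setup):
echo 'export PETSC_DIR=$HOME/soft/petsc-3.1' >> $HOME/.bashrc
echo 'export PETSC_ARCH=arch-cs595' >> $HOME/.bashrc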
We are interested in running some PETSc examples (over the years, the PETSc developers have accumulated good examples
of how the library can be used effectively). We could go to each of the directories where the examples live and run
them there, but instead we are going to copy the ones we need to a central location to keep things simple.
First let's make a couple of directories to put the examples in:
mkdir -p $HOME/cs595/petsc-handson/ksp
mkdir -p $HOME/cs595/petsc-handson/snes
and let's copy some examples in there along with the necessary makefiles:
cp $PETSC_DIR/src/ksp/ksp/examples/tutorials/makefile $HOME/cs595/petsc-handson/ksp
cp $PETSC_DIR/src/ksp/ksp/examples/tutorials/ex2.c $HOME/cs595/petsc-handson/ksp
cp $PETSC_DIR/src/ksp/ksp/examples/tutorials/ex2f.F $HOME/cs595/petsc-handson/ksp
cp $PETSC_DIR/src/snes/examples/tutorials/makefile $HOME/cs595/petsc-handson/snes
cp $PETSC_DIR/src/snes/examples/tutorials/ex19.c $HOME/cs595/petsc-handson/snes
If you look in the directory
ls $HOME/cs595/petsc-handson
you'll see that there are two folders you have created: ksp and snes.
The ksp folder contains examples of linear
problems - ksp is short for Krylov subSPace, a popular family of techniques for solving large sparse linear
systems. Such systems are common in discretized PDEs, and KSP methods are the foundation of PETSc. Every linear
system takes the form Ax=b, so the minimum needed to solve such a problem is a matrix A and a right-hand side b;
the short C sketch below shows what those minimal KSP calls look like.
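To give you a feel for what that minimum looks like in code, here is a small, self-contained sketch in the style
of the KSP tutorials. This is my own illustration, not one of the copied examples, and it assumes the PETSc 3.1
C API (where the Destroy routines take the object itself). It assembles a tiny 1-D Laplacian and solves Ax=b with
whatever KSP and PC you pick on the command line:

static char help[] = "Solves a small 1-D Laplacian system with KSP.\n";

#include "petscksp.h"

int main(int argc, char **argv)
{
  Mat            A;        /* the linear system matrix */
  Vec            x, b;     /* solution and right-hand side */
  KSP            ksp;      /* the linear solver context */
  PetscInt       i, n = 10, col[3];
  PetscScalar    value[3];
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, (char *)0, help);CHKERRQ(ierr);

  /* Create the solution vector x and the right-hand side b */
  ierr = VecCreate(PETSC_COMM_WORLD, &x);CHKERRQ(ierr);
  ierr = VecSetSizes(x, PETSC_DECIDE, n);CHKERRQ(ierr);
  ierr = VecSetFromOptions(x);CHKERRQ(ierr);
  ierr = VecDuplicate(x, &b);CHKERRQ(ierr);
  ierr = VecSet(b, 1.0);CHKERRQ(ierr);

  /* Create and assemble the matrix A (a tridiagonal 1-D Laplacian)
     (newer PETSc releases also want a MatSetUp(A) call before MatSetValues) */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  value[0] = -1.0; value[1] = 2.0; value[2] = -1.0;
  for (i = 1; i < n - 1; i++) {
    col[0] = i - 1; col[1] = i; col[2] = i + 1;
    ierr = MatSetValues(A, 1, &i, 3, col, value, INSERT_VALUES);CHKERRQ(ierr);
  }
  i = 0;     col[0] = 0;     col[1] = 1;     value[0] = 2.0;  value[1] = -1.0;
  ierr = MatSetValues(A, 1, &i, 2, col, value, INSERT_VALUES);CHKERRQ(ierr);
  i = n - 1; col[0] = n - 2; col[1] = n - 1; value[0] = -1.0; value[1] = 2.0;
  ierr = MatSetValues(A, 1, &i, 2, col, value, INSERT_VALUES);CHKERRQ(ierr);
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* The KSP part: create a solver, hand it A, let the command line
     pick the method (-ksp_type, -pc_type, ...), and solve Ax = b */
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

  /* Clean up (PETSc 3.1-style Destroy calls) */
  ierr = KSPDestroy(ksp);CHKERRQ(ierr);
  ierr = MatDestroy(A);CHKERRQ(ierr);
  ierr = VecDestroy(x);CHKERRQ(ierr);
  ierr = VecDestroy(b);CHKERRQ(ierr);
  ierr = PetscFinalize();CHKERRQ(ierr);
  return 0;
}

If you want to try it, save it in the ksp directory under a made-up name such as myksp.c, add a link target to the
makefile modeled on the ex2 target, and run it on one process, e.g. mpiexec -n 1 ./myksp -ksp_view.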
The other folder is labeled snes (Scalable Nonlinear Equations Solvers) and contains examples of nonlinear problems.
In general, these problems are solved with a Newton-like method, which solves a sequence of linear systems to
approximate the solution of F(x)=0. Here F is some nonlinear function that you need to provide to SNES, along with
an initial guess for x. If you know the Jacobian of F, providing it helps, but it is not required - PETSc can
approximate it for you (for example with finite differences).
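In case it helps to see it written out, each step of such a Newton-like method linearizes F around the current
iterate x_k and solves a linear system for the update (this is the textbook Newton step, stated in general rather
than taken from the example code):

J(x_k) dx = -F(x_k),   then   x_{k+1} = x_k + dx

so every nonlinear iteration contains a linear solve of exactly the Ax=b kind that KSP handles.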
Let's start running some examples in the ksp folder
cd $HOME/cs595/petsc-handson/ksp
We also need to make sure that our shell knows how to run programs in parallel. To check this we will ask it
which mpiexec program it wants to run
#userid#@ada:~> which mpiexec
/home/class/fall-10/cs595/#userid#/soft/petsc-3.1/externalpackages/mpich2-1.0.8/bin/mpiexec
If you don't see that, you need to add that directory to your PATH variable. You can do that with
export PATH=$PATH:$HOME/soft/petsc-3.1/externalpackages/mpich2-1.0.8/bin
The -help option will list all the available command-line options that
can be passed to this PETSc program. First we need to make the executable, then run it with that option:
make ex2
mpiexec -n 1 ./ex2 -help |more
or if you want to send all that to a file to be read later (or moved to another location), you can redirect
the output using the > operator
mpiexec -n 1 ./ex2 -help > kspoptions
less kspoptions
We can use the -mat_view_info option to see summary information about any
assembled PETSc matrix:
mpiexec -n 1 ./ex2 -mat_view_info
mpiexec -n 1 ./ex2 -m 100 -n 100 -mat_view_info
If you want you can even print out the contents of your matrices, although this is not recommended for very
large systems
mpiexec -n 1 ./ex2 -mat_view
Depending on the operating system that you are using, it is also possible to look at the sparsity pattern
of the matrix in question. I am still working on finding an easy way to render X11 on Windows without
making you install additional software. But if you can install an X server, or are running UNIX, check
out the X forwarding info on how to display pictures from
PETSc. Once you have X forwarding activated, you can pass -mat_view_draw
mpiexec -n 1 ./ex2 -mat_view_draw -draw_pause -1
When you are debugging your code and trying to determine exactly what your solver is doing, it may be helpful
to use the -ksp_view option. Check out the difference between the
output of this command
mpiexec -n 1 ./ex2 -ksp_view
and this command which runs the same code on 4 processors instead of one.
mpiexec -n 4 ./ex2 -ksp_view
If you are interested in watching how your KSP solver converges, use this option:
mpiexec -n 1 ./ex2 -ksp_monitor
One of the key components of iterative solvers is the use of preconditioners. As you'll learn in class,
the speed with which a KSP method converges depends on the condition number of the matrix. To improve the
convergence speed, an extra matrix called a preconditioner can be used during the linear
solve. Applying a preconditioner costs time, but it can be well worth it if the KSP method converges more quickly.
The default choice of preconditioner for all KSP methods is Block Jacobi with ILU(0) blocks. If you aren't
familiar with that language, it will be covered in class, but basically the idea is to let each
processor solve its own part of the linear system using an incomplete factorization. ... Okay, maybe
that didn't help either, but let's see what happens when we change the default:
mpiexec -n 1 ./ex2 -m 100 -n 100 -ksp_type gmres -pc_type ilu -ksp_max_it 20
mpiexec -n 1 ./ex2 -m 100 -n 100 -ksp_type gmres -pc_type ilu -ksp_max_it 20 -pc_factor_levels 4
The first line executes the code with ILU(0) and the second line runs the same code with ILU(4). In
both cases you can see that the maximum of 20 iterations was reached (the default maximum is 10000), and
while neither run converged (by default PETSc declares convergence once the residual norm has dropped far
enough relative to its initial value), the ILU(4) preconditioner did much better than the ILU(0)
preconditioner. What do you think that says about the relative quality of these two preconditioners?
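To connect this back to the linear algebra: with left preconditioning, instead of solving Ax=b directly, the KSP
method effectively works with

M^(-1) A x = M^(-1) b

where the preconditioner M is chosen so that applying M^(-1) is cheap and M^(-1)A is much better conditioned than A.
ILU(k) builds such an M from an incomplete LU factorization that allows only k levels of fill-in, which is why
ILU(4) is both more expensive per iteration and more effective than ILU(0). This is the general idea, not anything
specific to ex2.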
Now let's try a more complicated preconditioner: this one uses an Additive Schwarz (ASM) domain decomposition
preconditioner with a direct solve on each subdomain. At some point I'm going to put up good references for
all these techniques, because it's important to understand them - make sure you keep bugging me about that.
ASM and Block Jacobi are equivalent on 1 processor, so we will run on one process first and then on several
to see a difference:
mpiexec -n 1 ./ex2 -m 100 -n 100 -pc_type asm -sub_pc_type lu -ksp_view
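and now the same options on 4 processors, where the decomposition into subdomains actually shows up in the
-ksp_view output:
mpiexec -n 4 ./ex2 -m 100 -n 100 -pc_type asm -sub_pc_type lu -ksp_view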
Notice that we did not specify -ksp_type gmres this time; GMRES is the
default KSP solver in PETSc, so it isn't necessary. On one process this preconditioner amounts to a full
direct solve, so it finished in only 1 iteration, which is really fast. On several processes the coupling
between the subdomains is weak enough for this problem that very little is lost by the ASM blocking. This
will be discussed in more detail in lectures, but it's good for you to be exposed to it.
Another thing you should be exposed to is the use of external packages in PETSc. There are many
software packages that PETSc can use to make your life easier, and one of them is called SuperLU. Actually
SuperLU is pretty dated - MUMPS is more common today - but SuperLU is easier to install. If you are so
inclined (this process takes time) you can reconfigure your PETSc installation to include SuperLU;
this will give you the ability to solve systems directly in parallel, which PETSc by itself does not provide.
SuperLU itself requires other packages as well, and you can get them all by adding the following
options to your configure command:
--download-superlu --download-parmetis --download-superlu_dist
Return to Homework 2 for a refresher on configure and make. You then need to configure
and make PETSc with the standard configure command plus those options and you will have access to
SuperLU. To activate it at the command line, use
mpiexec -n 2 ./ex2 -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist
Let's move on now to nonlinear solvers. We need to first pop into the directory
cd $HOME/cs595/petsc-handson/snes
and now let's check out what the default settings are for SNES:
mpiexec -n 1 ./ex19 -snes_view
That's a lot of stuff, huh? It's made more complicated because the preconditioner (PC) here is PETSc's built-in
multigrid, which has a lot of complexity to it. The main thing that's really different now is that the top of
the output is not a KSP object but an SNES object. It says the SNES type is ls, which means Newton's method with
a line search - another idea I need to add to that list of confusing stuff. You also see the number of linear
solver iterations and the number of function evaluations - recall that a nonlinear solve consists of multiple
linear solves.
If you want to watch the method converge, try running with
mpiexec -n 1 ./ex19 -snes_monitor
If you don't need your convergence to be especially tight, you can change the tolerance at which the solver
declares convergence:
mpiexec -n 1 ./ex19 -snes_atol 1e-4 -snes_monitor
If you are seeing crummy convergence, you may be able to accelerate it by changing how tightly the KSP solves
inside the SNES solver are converged. One way to do that intelligently is the Eisenstat-Walker approach, which
can be activated using -snes_ksp_ew:
mpiexec -n 1 ./ex19 -prandtl 50 -grashof 100 -pc_type ilu
mpiexec -n 1 ./ex19 -prandtl 50 -grashof 100 -pc_type ilu -snes_ksp_ew
There are also different types of line search available, although using the default (cubic) is often the best choice.
mpiexec -n 1 ./ex19 -snes_ls basic
One of the most important topics in high performance computing is the cost incurred by communication between
processors. This cost is minimized by carefully designed communication algorithms, which the nice people at PETSc
have written for you. For a structured grid, a DA (distributed array) object abstracts all of that communication
away so that you can focus on your algorithms and let the communication be handled efficiently. Try running these
two commands to see the difference in how the same problem is organized on 1 and 4 processors.
mpiexec -n 1 ./ex19 -da_view
mpiexec -n 4 ./ex19 -da_view
DA objects may be important in your project depending on what your problem is, so keep them in the back of your mind
when you are choosing an approach to solving your problems.
There are many other options built into PETSc which allow the user to interact with the library and
obtain useful information from it. One of those options is -log_summary,
which provides you with a detailed description of everything that happened during your code's execution. Hop back to
the ksp directory
cd $HOME/cs595/petsc-handson/ksp
and try running the example
mpiexec -n 4 ./ex2 -log_summary |more
Alternatively you can have the summary written to a file:
mpiexec -n 4 ./ex2 -log_summary my_log_summary
If you are having trouble debugging your code you can activate a debugger, although if you're not already comfortable
using a debugger I don't know that I'd recommend it:
mpiexec -n 1 ./ex2 -on_error_attach_debugger gdb
mpiexec -n 1 ./ex2 -start_in_debugger gdb
If you are not comfortable with a debugger, you can also have PETSc print everything that it is doing to a file
mpiexec -n 1 ./ex2 -info whats_goin_on
If you feel like errors in your code are due to memory issues (a common problem in scientific computing, since
the garbage collection found in C#, Java, and Python is not used, so as not to slow down the code), you can use the
-malloc_dump option
mpiexec -n 1 ./ex2 -malloc_dump
If you are worried that an option you are passing at runtime is not actually being used, you can check for
misspelled or unused options with -options_left (note the deliberate typo in the first command):
mpiexec -n 1 ./ex2 -pc_tupe ilu -options_left
mpiexec -n 1 ./ex2 -pc_type ilu -options_left
If you have a boatload of options that are pretty much the same from run to run and you don't want to retype them
every time, you can put them in a file and tell PETSc to check that file for your options:
echo -pc_type ilu -ksp_type gmres -snes_ksp_ew -malloc_dump -log_summary my.log > all_my_options
mpiexec -n 1 ./ex2 -options_file all_my_options
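If I remember right, PETSc also looks for a file named .petscrc in your home directory when it starts up, so options
you want applied to every run can live there instead; for example:
echo -malloc_dump > $HOME/.petscrc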