xref: /petsc/doc/faq/index.md (revision ede9db9363e1fdaaa09befd664c8164883ccce80)
(doc_faq)=

# FAQ

```{contents} Table Of Contents
:backlinks: top
:local: true
```

______________________________________________________________________

## General

### How can I subscribe to the PETSc mailing lists?

See the mailing list {ref}`documentation <doc_mail>`.

### Any useful books on numerical computing?

[Bueler, PETSc for Partial Differential Equations: Numerical Solutions in C and Python](https://my.siam.org/Store/Product/viewproduct/?ProductId=32850137)

[Oliveira and Stewart, Writing Scientific Software: A Guide to Good Style](https://www.cambridge.org/core/books/writing-scientific-software/23206704175AF868E43FE3FB399C2F53)

(doc_faq_general_parallel)=

### What kind of parallel computers or clusters are needed to use PETSc? Or why do I get little speedup?

:::{important}
PETSc can be used with any kind of parallel system that supports MPI, BUT for any decent
performance one needs:

- A fast, **low-latency** interconnect; any ethernet (even 10 GigE) simply cannot provide
  the needed performance.
- High per-core **memory** performance. Each core needs to
  have its **own** memory bandwidth of at least 2 gigabytes/second. Most modern
  computers are not bottlenecked by how fast they can perform
  calculations; rather, they are usually restricted by how quickly they can get their
  data.
:::

To obtain good performance it is important that you know your machine! That is: how many
compute nodes (generally, how many motherboards), how many memory sockets per node, how
many cores per memory socket, and how much memory bandwidth each has.
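On Linux, a quick way to inspect this layout is with standard command-line tools (the
exact fields printed by `lscpu` vary by distribution):

```console
$ nproc                             # logical cores visible to the OS
$ lscpu | grep -E 'Socket|NUMA'     # sockets and NUMA nodes (Linux only)
```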

If you do not know this, and you can run MPI programs with `mpiexec` (that is, you don't
have a batch system), run the following from `$PETSC_DIR`:

```console
$ make streams [NPMAX=maximum_number_of_mpi_processes_you_plan_to_use]
```

This will provide a summary of the bandwidth achieved with different numbers of MPI
processes and the potential speedups. See {any}`ch_streams` for a detailed discussion.

If you have a batch system:

```console
$ cd $PETSC_DIR/src/benchmarks/streams
$ make MPIVersion
```

Submit `MPIVersion` to the batch system a number of times with 1, 2, 3, etc. MPI
processes, collecting all of the output from the runs into the single file
`scaling.log`. Copy `scaling.log` into the `src/benchmarks/streams` directory, then run:

```console
$ ./process.py createfile ; ./process.py
```

Even if you have enough memory bandwidth, performance can degrade if the OS switches
processes between cores. Smart process-to-core/socket binding (this just means locking a
process to a particular core or memory socket) may help you. For example, consider using
fewer processes than cores and binding processes to separate sockets so that each process
uses a different memory bus:

- [MPICH binding with the Hydra process manager](https://github.com/pmodels/mpich/blob/main/doc/wiki/how_to/Using_the_Hydra_Process_Manager.md#process-core-binding)

  ```console
  $ mpiexec.hydra -n 4 --binding cpu:sockets
  ```

- [Open MPI binding](https://www.open-mpi.org/faq/?category=tuning#using-paffinity)

  ```console
  $ mpiexec -n 4 --map-by socket --bind-to socket --report-bindings
  ```

- `taskset`, part of the [util-linux](https://github.com/karelzak/util-linux) package

  Check `man taskset` for details. Make sure to set affinity for **your** program,
  **not** for the `mpiexec` program.

- `numactl`

  In addition to task affinity, this tool also allows changing the default memory-affinity
  policy. On Linux, the default policy is to attempt to find memory on the same memory bus
  that serves the core that a thread is running on when the memory is faulted
  (not when `malloc()` is called). If local memory is not available, it is found
  elsewhere, possibly leading to serious memory imbalances.

  The option `--localalloc` allocates memory on the local NUMA node, similar to the
  `numa_alloc_local()` function in the `libnuma` library. The option
  `--cpunodebind=nodes` binds the process to a given NUMA node (note that a NUMA node can be
  larger or smaller than a CPU (socket); a NUMA node usually has multiple cores).

  The option `--physcpubind=cpus` binds the process to a given processor core (numbered
  according to `/proc/cpuinfo`, therefore including logical cores if Hyper-Threading is
  enabled).

  With Open MPI, you can use knowledge of the NUMA hierarchy and core numbering on your
  machine to calculate the correct NUMA node or processor number given the environment
  variable `OMPI_COMM_WORLD_LOCAL_RANK`. In most cases, it is easier to let `mpiexec` or
  a resource manager set the affinities.
______________________________________________________________________

### What kind of license is PETSc released under?

See the licensing {ref}`documentation <doc_license>`.

### Why is PETSc written in C, instead of Fortran or C++?

When this decision was made, in the early 1990s, C enabled us to build data structures
for storing sparse matrices, solver information,
etc. in ways that Fortran simply did not allow. ANSI C was a complete standard that all
modern C compilers supported, and the language was identical on all machines. C++ was still
evolving, and compilers on different machines were not identical. Using C function pointers
to provide data encapsulation and polymorphism allowed us to get many of the advantages of
C++ without using such a large and complicated language. It would have been natural and
reasonable to have coded PETSc in C++; we opted to use C instead.

### Does all the PETSc error checking and logging reduce PETSc's efficiency?

No.

(doc_faq_maintenance_strats)=

### How does such a small group of people manage to write and maintain such a large and marvelous package as PETSc?

1. **We work very efficiently**.

   - We use powerful editors and programming environments.
   - Our manual pages are generated automatically from formatted comments in the code,
     thus alleviating the need for creating and maintaining separate manual pages.
   - We employ continuous integration testing of the entire PETSc library on many different
     machine architectures. This process **significantly** protects (no bug-catching
     process is perfect) against inadvertently introducing bugs with new additions. Every
     new feature **must** pass our suite of thousands of tests as well as formal code
     review before it may be included.

2. **We are very careful in our design (and are constantly revising our design)**

   - PETSc as a package should be easy to use, write, and maintain. Our mantra is to write
     code like everyone is using it.

3. **We are willing to do the grunt work**

   - PETSc is regularly checked to make sure that all code conforms to our interface
     design. We will never keep a bad design decision simply because changing it would
     require a lot of editing; we do a lot of editing.

4. **We constantly seek out and experiment with new design ideas**

   - We retain the useful ones and discard the rest. All of these decisions are based not
     just on performance, but also on **practicality**.

5. **Function and variable names must adhere to strict guidelines**

   - Even the rules about capitalization are designed to make it easy to figure out the
     name of a particular object or routine. Our memories are terrible, so careful,
     consistent naming puts less stress on our limited human RAM.

6. **The PETSc directory tree is designed to make it easy to move throughout the
   entire package**

7. **We have a rich, robust, and fast bug reporting system**

   - <mailto:petsc-maint@mcs.anl.gov> is always checked, and we pride ourselves on responding
     quickly and accurately. Email is very lightweight, and our bug reporting system retains
     an archive of all reported problems and fixes, so it is easy to re-find fixes to
     previously discovered problems.

8. **We contain the complexity of PETSc by using powerful object-oriented programming
   techniques**

   - Data encapsulation serves to abstract complex data formats or movement to a
     human-readable format. This is why your program cannot, for example, look directly
     at what is inside the object `Mat`.
   - Polymorphism makes changing program behavior as easy as possible, and further
     abstracts the *intent* of your program from what is *written* in code. You call
     `MatMult()` regardless of whether your matrix is dense, sparse, parallel or
     sequential; you don't call a different routine for each format.

9. **We try to provide the functionality requested by our users**

### For complex numbers will I get better performance with C++?

To use PETSc with complex numbers you may use the following `configure` options:
`--with-scalar-type=complex` and either `--with-clanguage=c++` or (the default)
`--with-clanguage=c`. In our experience they deliver very similar performance
(speed), but if you are concerned, try both and see which is faster.

### How come when I run the same program on the same number of processes I get a "different" answer?

Inner products and norms in PETSc are computed using the `MPI_Allreduce()` command. In
different runs the values may arrive at a given process (via MPI) in a
different order, so the order in which some floating point arithmetic operations are
performed will be different. Since floating point arithmetic is not
associative, the computed quantity may be slightly different.

Over a run the many slight differences in the inner products and norms will affect all the
computed results. It is important to realize that none of the computed answers are any
less right or wrong (in fact the sequential computation is no more right than the parallel
ones). All answers are equal, but some are more equal than others.
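The non-associativity is easy to observe on any machine with IEEE double precision
arithmetic; Python is used here only for brevity, and the same holds in C or Fortran:

```console
$ python3 -c "print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))"
False
```

The two groupings round differently, so summing the same values in a different order, as
`MPI_Allreduce()` may do, gives a slightly different result.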

The discussion above assumes that the exact same algorithm is being used on the different
numbers of processes. When the algorithm differs with the number of processes
(almost all preconditioner algorithms except Jacobi are different for different numbers of
processes) then one expects to see (and does see) a greater difference in results for
different numbers of processes. In some cases (for example the block Jacobi preconditioner) it
may be that the algorithm works for some numbers of processes and does not work for others.

### How come when I run the same linear solver on a different number of processes it takes a different number of iterations?

The convergence of many of the preconditioners in PETSc, including the default parallel
preconditioner block Jacobi, depends on the number of processes. The more processes, the
(slightly) slower the convergence. This is the nature of iterative solvers: more
parallelism means more "older" information is used in the solution process, hence
slower convergence.

(doc_faq_gpuhowto)=

### Can PETSc use GPUs to speed up computations?

:::{seealso}
See the GPU development {ref}`roadmap <doc_gpu_roadmap>` for the latest information
regarding the state of PETSc GPU integration.

See the GPU install {ref}`documentation <doc_config_accel>` for up-to-date information on
installing PETSc to use GPUs.
:::

Quick summary of usage with CUDA:

- The `VecType` `VECSEQCUDA`, `VECMPICUDA`, or `VECCUDA` may be used with
  `VecSetType()` or `-vec_type seqcuda`, `mpicuda`, or `cuda` when
  `VecSetFromOptions()` is used.
- The `MatType` `MATSEQAIJCUSPARSE`, `MATMPIAIJCUSPARSE`, or `MATAIJCUSPARSE`
  may be used with `MatSetType()` or `-mat_type seqaijcusparse`, `mpiaijcusparse`, or
  `aijcusparse` when `MatSetFromOptions()` is used.
- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type
  cuda` and `-dm_mat_type aijcusparse`.

Quick summary of usage with OpenCL (provided by the ViennaCL library):

- The `VecType` `VECSEQVIENNACL`, `VECMPIVIENNACL`, or `VECVIENNACL` may be used
  with `VecSetType()` or `-vec_type seqviennacl`, `mpiviennacl`, or `viennacl`
  when `VecSetFromOptions()` is used.
- The `MatType` `MATSEQAIJVIENNACL`, `MATMPIAIJVIENNACL`, or `MATAIJVIENNACL`
  may be used with `MatSetType()` or `-mat_type seqaijviennacl`, `mpiaijviennacl`, or
  `aijviennacl` when `MatSetFromOptions()` is used.
- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type
  viennacl` and `-dm_mat_type aijviennacl`.

General hints:

- Since debugging on GPUs is difficult, it is useful to develop your code with the default
  vectors and matrices and then switch to the GPU via the command line options for
  production runs.
- All of the Krylov methods except `KSPIBCGS` run on the GPU.
- Parts of most preconditioners run directly on the GPU. After setup, `PCGAMG` runs
  fully on GPUs, without any memory copies between the CPU and GPU.

Some GPU systems (for example many laptops) only run with single precision; for these,
PETSc must be built with the `configure` option `--with-precision=single`.

(doc_faq_extendedprecision)=

### Can I run PETSc with extended precision?

Yes, with gcc and gfortran. `configure` PETSc using the
options `--with-precision=__float128` and `--download-f2cblaslapack`.

:::{admonition} Warning
:class: yellow

External packages are not guaranteed to work in this mode!
:::

### Why doesn't PETSc use QD to implement support for extended precision?

We tried really hard but could not. The problem is that the QD C++ classes, though they
try to mimic the built-in `double` data type, are not native types and cannot
"just be used" in a general piece of numerical source code. Rather, the code has to be
rewritten to live within the limitations of the QD classes. However, PETSc can be built to use
quad precision, as detailed {ref}`here <doc_faq_extendedprecision>`.

### How do I cite PETSc?

Use {any}`these citations <doc_index_citing_petsc>`.

______________________________________________________________________

## Installation

### How do I begin using PETSc if the software has already been completely built and installed by someone else?

Assuming that the PETSc libraries have been successfully built for a particular
architecture and level of optimization, a new user must merely:

1. Set `PETSC_DIR` to the full path of the PETSc home
   directory. This will be the location of the `configure` script, and is usually called
   "petsc" or some variation of that (for example, /home/username/petsc).
2. Set `PETSC_ARCH`, which indicates the configuration on which PETSc will be
   used. Note that `$PETSC_ARCH` is simply a name the installer used when installing
   the libraries. There will exist a directory within `$PETSC_DIR` that is named after
   its corresponding `$PETSC_ARCH`. There may be several on a single system, for
   example "linux-c-debug" for the debug version compiled by a C compiler or
   "linux-c-opt" for the optimized version.
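In a bash-like shell, the two steps amount to the following (the path and architecture
name here are illustrative, not required values):

```console
$ export PETSC_DIR=/home/username/petsc
$ export PETSC_ARCH=linux-c-opt
$ ls $PETSC_DIR/$PETSC_ARCH/lib    # the installed libraries live here
```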

:::{admonition} Still Stuck?
See the {ref}`quick-start tutorial <tut_install>` for a step-by-step guide on
installing PETSc, in case you have missed a step.

See the users manual section on {ref}`getting started <sec_getting_started>`.
:::

### The PETSc distribution is SO large. How can I reduce my disk space usage?

1. The PETSc users manual is provided in PDF format at `$PETSC_DIR/manual.pdf`. You
   can delete this.
2. The PETSc test suite contains sample output for many of the examples. These are
   contained in the PETSc directories `$PETSC_DIR/src/*/tutorials/output` and
   `$PETSC_DIR/src/*/tests/output`. Once you have run the test examples, you may remove
   all of these directories to save some disk space. You can locate the largest with,
   e.g., `find . -name output -type d | xargs du -sh | sort -hr` on a Unix-based system.
3. The debugging versions of the libraries are larger than the optimized versions. In a
   pinch you can work with the optimized version, although we bid you good luck in
   finding bugs, as it is much easier with the debug version.

### I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI?

No, run `configure` with the option `--with-mpi=0`.

### Can I install PETSc to not use X windows (either under Unix or Microsoft Windows with GCC)?

Yes. Run `configure` with the additional flag `--with-x=0`.

### Why do you use MPI?

MPI is the message-passing standard. Because it is a standard, it will not change frequently
over time; thus, we do not have to change PETSc every time the provider of the message-passing
system decides to make an interface change. MPI was carefully designed by experts from
industry, academia, and government labs to provide the highest quality performance and
capability.

For example, the careful design of communicators in MPI allows the easy nesting of
different libraries; no other message-passing system provides this support. All of the
major parallel computer vendors were involved in the design of MPI and have committed to
providing quality implementations.

In addition, since MPI is a standard, several different groups have already provided
complete, free implementations. Thus, one does not have to rely on the technical skills of
one particular group to provide the message-passing libraries. Today, MPI is the only
practical, portable approach to writing efficient parallel numerical software.

(invalid_mpi_compilers)=

### What do I do if my MPI compiler wrappers are invalid?

Most MPI implementations provide compiler wrappers (such as `mpicc`) which give the
underlying compilers the include and link options necessary to use that version of
MPI. Configuration will fail if these wrappers are either absent or broken in the MPI pointed to by
`--with-mpi-dir`. You can rerun `configure` with the additional option
`--with-mpi-compilers=0`, which will try to auto-detect working compilers; however,
these compilers may be incompatible with the particular MPI build. If this fix does not
work, run with `--with-cc=[your_c_compiler]` where you know `your_c_compiler` works
with this particular MPI, and likewise for C++ (`--with-cxx=[your_cxx_compiler]`) and Fortran
(`--with-fc=[your_fortran_compiler]`).

(bit_indices)=

### When should/can I use the `configure` option `--with-64-bit-indices`?

By default the type that PETSc uses to index into arrays and to store the sizes of arrays
is `PetscInt`, defined to be a 32-bit `int`. If your problem:

- involves more than 2^31 - 1 unknowns (around 2 billion), or
- has a matrix that might contain more than 2^31 - 1 nonzeros on a single process,

then you need to use this option. Otherwise you will get strange crashes.

This option can be used when you are using either 32-bit or 64-bit pointers. You do not
need to use this option if you are using 64-bit pointers, unless one of the two conditions
above holds.
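To get a feel for where the threshold lies: a 3-D grid of 1291^3 points with a single
unknown per point already exceeds what a 32-bit `PetscInt` can index:

```console
$ python3 -c "print(1291**3, 2**31 - 1, 1291**3 > 2**31 - 1)"
2151685171 2147483647 True
```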

### What if I get an internal compiler error?

You can rebuild the offending file individually with a lower optimization level. **Then
make sure to complain to the compiler vendor and file a bug report**. For example, if the
compiler chokes on `src/mat/impls/baij/seq/baijsolvtrannat.c` you can run the following
from `$PETSC_DIR`:

```console
$ make -f gmakefile PCC_FLAGS="-O1" $PETSC_ARCH/obj/src/mat/impls/baij/seq/baijsolvtrannat.o
$ make all
```

### How do I enable Python bindings (petsc4py) with PETSc?

1. Install [Cython](https://cython.org/).
2. `configure` PETSc with the `--with-petsc4py=1` option.
3. Set `PYTHONPATH=$PETSC_DIR/$PETSC_ARCH/lib`.
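Once these steps are done, a quick sanity check from the shell could look like the
following; this assumes the petsc4py build above succeeded and that your petsc4py version
provides `PETSc.Sys.getVersion()`:

```console
$ export PYTHONPATH=$PETSC_DIR/$PETSC_ARCH/lib
$ python3 -c "from petsc4py import PETSc; print(PETSc.Sys.getVersion())"
```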

(macos_gfortran)=

### What Fortran compiler do you recommend on macOS?

We recommend using [homebrew](https://brew.sh/) to install [gfortran](https://gcc.gnu.org/wiki/GFortran); see {any}`doc_macos_install`.

### How can I find the URL locations of the packages you install using `--download-PACKAGE`?

```console
$ grep "self.download " $PETSC_DIR/config/BuildSystem/config/packages/*.py
```

### How do I fix the problem: PETSc was configured with one MPICH (or Open MPI) `mpi.h` version but now appears to be compiling using a different MPICH (or Open MPI) `mpi.h` version?

This generally happens for one of two reasons:

- You previously ran `configure` with the option `--download-mpich` (or `--download-openmpi`)
  but later ran `configure` to use a version of MPI already installed on the
  machine. Solution:

  ```console
  $ rm -rf $PETSC_DIR/$PETSC_ARCH
  $ ./configure --your-args
  ```

(mpi_network_misconfigure)=

### What does it mean when `make check` hangs or errors in `PetscOptionsInsertFile()`?

For example:

```none
Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
See https://petsc.org/release/faq/
[0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #3 PetscInitialize() line 828 in /Users/barrysmith/Src/PETSc/src/sys/objects/pinit.c
```

or

```none
$ make check
Running check examples to verify correct installation
Using PETSC_DIR=/Users/barrysmith/Src/petsc and PETSC_ARCH=arch-fix-mpiexec-hang-2-ranks
C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
PROGRAM SEEMS TO BE HANGING HERE
```

This usually occurs when network settings are misconfigured (perhaps due to a VPN),
resulting in a failure or hang in the system call `gethostbyname()`.

- Verify you are using the correct `mpiexec` for the MPI you have linked PETSc with.

- If you have a VPN enabled on your machine, try turning it off and then running `make check` to
  verify that it is not the VPN playing poorly with MPI.

- If ``ping `hostname` `` (`/sbin/ping` on macOS) fails or hangs, do:

  ```none
  echo 127.0.0.1 `hostname` | sudo tee -a /etc/hosts
  ```

  and try `make check` again.

- Try completely disconnecting your machine from the network and see if `make check` then works.

- Try the PETSc `configure` option `--download-mpich-device=ch3:nemesis` with `--download-mpich`.
4759b92b1d3SBarry Smith
______________________________________________________________________

## Usage

### How can I redirect PETSc's `stdout` and `stderr` when programming with a GUI interface in Windows Developer Studio or to C++ streams?

To overload just the error messages, write your own `MyPrintError()` function that does
whatever you want (including pop-up windows, etc.) and use it as shown below.

```c
extern "C" {
  int PASCAL WinMain(HINSTANCE,HINSTANCE,LPSTR,int);
};

#include <petscsys.h>
#include <mpi.h>

const char help[] = "Set up from main";

int MyPrintError(const char error[], ...)
{
  printf("%s", error);
  return 0;
}

int main(int ac, char *av[])
{
  char      buf[256];
  HINSTANCE inst;

  inst = (HINSTANCE)GetModuleHandle(NULL);
  PetscErrorPrintf = MyPrintError;

  buf[0] = 0;
  for (int i = 1; i < ac; ++i) {
    strcat(buf, av[i]);
    strcat(buf, " ");
  }

  PetscCall(PetscInitialize(&ac, &av, NULL, help));

  return WinMain(inst, NULL, buf, SW_SHOWNORMAL);
}
```

Place this file in the project and compile with these preprocessor definitions:

```
WIN32
_DEBUG
_CONSOLE
_MBCS
USE_PETSC_LOG
USE_PETSC_BOPT_g
USE_PETSC_STACK
_AFXDLL
```

And these link options:

```
/nologo
/subsystem:console
/incremental:yes
/debug
/machine:I386
/nodefaultlib:"libcmtd.lib"
/nodefaultlib:"libcd.lib"
/nodefaultlib:"mvcrt.lib"
/pdbtype:sept
```

:::{note}
The above is compiled and linked as if it were a console program. The linker will search
for a `main`, and from it the `WinMain` will start. This works with MFC templates and
derived classes too.

When writing a Windows console application you do not need to do anything; the `stdout`
and `stderr` are automatically output to the console window.
:::

To change where all PETSc `stdout` and `stderr` go (you can also reassign
`PetscVFPrintf()` to handle `stdout` and `stderr` any way you like), write the
following function:

```c
PetscErrorCode mypetscvfprintf(FILE *fd, const char format[], va_list Argp)
{
  PetscFunctionBegin;
  if (fd != stdout && fd != stderr) { /* handle regular files */
    PetscCall(PetscVFPrintfDefault(fd, format, Argp));
  } else {
    char buff[1024]; /* Make sure to assign a large enough buffer */
    int  length;

    PetscCall(PetscVSNPrintf(buff, 1024, format, &length, Argp));

    /* now send buff to whatever stream or whatever you want */
  }
  PetscFunctionReturn(PETSC_SUCCESS);
}
```

Then assign `PetscVFPrintf = mypetscvfprintf` before `PetscInitialize()` in your main
program.
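
The buffer-then-route pattern used above can be sketched in plain C++ without PETSc; this is an illustrative stand-in, not PETSc API, and the names `my_vfprintf`, `my_printf`, and `captured` are invented for the example. The formatted message is captured into a string instead of a terminal, standing in for "whatever stream you want":

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Captured output accumulates here instead of going to the terminal
// (illustrative stand-in for redirecting to a GUI widget or C++ stream).
static std::string captured;

// A printf-style handler in the spirit of a custom PetscVFPrintf:
// format the message into a buffer, then route it wherever we like.
static int my_vfprintf(const char *format, va_list argp)
{
  char buff[1024]; /* make sure the buffer is large enough */
  int  length = vsnprintf(buff, sizeof(buff), format, argp);
  if (length < 0) return length;
  captured.append(buff);
  return length;
}

// Convenience wrapper so callers can use it like printf.
static int my_printf(const char *format, ...)
{
  va_list argp;
  va_start(argp, format);
  int ret = my_vfprintf(format, argp);
  va_end(argp);
  return ret;
}
```

The same shape (format into a local buffer, then forward) is what the `PetscVSNPrintf()` branch of the PETSc function above performs.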

### I want to use Hypre boomerAMG without GMRES but when I run `-pc_type hypre -pc_hypre_type boomeramg -ksp_type preonly` I don't get a very accurate answer!

You should run with `-ksp_type richardson` to have PETSc run several V or W
cycles. `-ksp_type preonly` causes boomerAMG to use only one V/W cycle. You can control
how many cycles are used in a single application of the boomerAMG preconditioner with
`-pc_hypre_boomeramg_max_iter <it>` (the default is 1). You can also control the
tolerance boomerAMG uses to decide whether to stop before `max_iter` with
`-pc_hypre_boomeramg_tol <tol>` (the default is 1.e-7). Run with `-ksp_view` to see
all the hypre options used and `-help | grep boomeramg` to see all the command line
options.

### How do I use PETSc for Domain Decomposition?

PETSc includes Additive Schwarz methods in the suite of preconditioners under the umbrella
of `PCASM`. These may be activated with the runtime option `-pc_type asm`. Various
other options may be set, including the degree of overlap with `-pc_asm_overlap <number>`,
the type of restriction/extension with `-pc_asm_type [basic,restrict,interpolate,none]`,
and several others. You may see the available ASM options by using `-pc_type asm
-help`. See the procedural interfaces in the manual pages, for example `PCASMSetType()`,
and check the index of the users manual for `PCASMCreateSubdomains()`.

PETSc also contains a domain-decomposition-inspired wirebasket or face-based two-level
method where the coarse mesh to fine mesh interpolation is defined by solving specific
local subdomain problems. It currently only works for 3D scalar problems on structured
grids created with PETSc `DMDA`. See the manual page for `PCEXOTIC` and
`src/ksp/ksp/tutorials/ex45.c` for an example.

PETSc also contains a balancing Neumann-Neumann type preconditioner, see the manual page
for `PCBDDC`. This requires matrices be constructed with `MatCreateIS()` via the finite
element method. See `src/ksp/ksp/tests/ex59.c` for an example.

### You have `MATAIJ` and `MATBAIJ` matrix formats, and `MATSBAIJ` for symmetric storage, how come no `MATSAIJ`?

Just for historical reasons; the `MATSBAIJ` format with block size one is just as efficient as
a `MATSAIJ` would be.

### Can I create BAIJ matrices with different size blocks for different block rows?

No. The `MATBAIJ` format only supports a single fixed block size for the entire
matrix. But the `MATAIJ` format automatically searches for matching rows and thus still
takes advantage of the natural blocks in your matrix to obtain good performance.

:::{note}
If you use `MATAIJ` you cannot use `MatSetValuesBlocked()`.
:::

### How do I access the values of a remote parallel PETSc Vec?

1. On each process create a local `Vec` large enough to hold all the values it wishes to
   access.
2. Create a `VecScatter` that scatters from the parallel `Vec` into the local `Vec`.
3. Use `VecGetArray()` to access the values in the local `Vec`.

For example, assume we have distributed a vector `vecGlobal` of size $N$ to
$R$ ranks and each rank holds $N/R = m$ values (assume that
$N$ is cleanly divisible by $R$). We want each rank $r$ to gather the
first $n$ (also assume $n \leq m$) values from its immediately superior neighbor
$r+1$ (the final rank will retrieve from rank 0).

```c
Vec            vecLocal;
IS             isLocal, isGlobal;
VecScatter     ctx;
PetscScalar    *arr;
PetscInt       N, n, firstGlobalIndex; /* n = number of values to gather, set by the user */
MPI_Comm       comm;
PetscMPIInt    r, R;

/* Create sequential local vector, big enough to hold local portion */
PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, &vecLocal));

/* Create IS to describe where we want to scatter to */
PetscCall(ISCreateStride(PETSC_COMM_SELF, n, 0, 1, &isLocal));

/* Compute the global indices */
PetscCall(VecGetSize(vecGlobal, &N));
PetscCall(PetscObjectGetComm((PetscObject) vecGlobal, &comm));
PetscCallMPI(MPI_Comm_rank(comm, &r));
PetscCallMPI(MPI_Comm_size(comm, &R));
firstGlobalIndex = r == R-1 ? 0 : (N/R)*(r+1);

/* Create IS that describes where we want to scatter from */
PetscCall(ISCreateStride(comm, n, firstGlobalIndex, 1, &isGlobal));

/* Create the VecScatter context */
PetscCall(VecScatterCreate(vecGlobal, isGlobal, vecLocal, isLocal, &ctx));

/* Gather the values */
PetscCall(VecScatterBegin(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(VecScatterEnd(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));

/* Retrieve and do work */
PetscCall(VecGetArray(vecLocal, &arr));
/* Work */
PetscCall(VecRestoreArray(vecLocal, &arr));

/* Don't forget to clean up */
PetscCall(ISDestroy(&isLocal));
PetscCall(ISDestroy(&isGlobal));
PetscCall(VecScatterDestroy(&ctx));
PetscCall(VecDestroy(&vecLocal));
```

(doc_faq_usage_alltoone)=

### How do I collect to a single processor all the values from a parallel PETSc Vec?

1. Create the `VecScatter` context that will do the communication:

   ```c
   Vec        in_par,out_seq;
   VecScatter ctx;

   PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
   ```

2. Initiate the communication (this may be repeated if you wish):

   ```c
   PetscCall(VecScatterBegin(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
   PetscCall(VecScatterEnd(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
   /* May destroy context now if no additional scatters are needed, otherwise reuse it */
   PetscCall(VecScatterDestroy(&ctx));
   ```

Note that this simply concatenates in the parallel ordering of the vector (determined by the
`MPI_Comm_rank` of the owning process). If you are using a `Vec` from
`DMCreateGlobalVector()` you likely want to first call `DMDAGlobalToNaturalBegin()`
followed by `DMDAGlobalToNaturalEnd()` to scatter the original `Vec` into the natural
ordering in a new global `Vec` before calling `VecScatterBegin()`/`VecScatterEnd()`
to scatter the natural `Vec` onto all processes.

### How do I collect all the values from a parallel PETSc Vec on the 0th rank?

See the FAQ entry on collecting to {ref}`an arbitrary processor <doc_faq_usage_alltoone>`, but
replace

```c
PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
```

with

```c
PetscCall(VecScatterCreateToZero(in_par, &ctx, &out_seq));
```

:::{note}
The same ordering considerations discussed in the aforementioned entry also apply
here.
:::

### How can I read in or write out a sparse matrix in Matrix Market, Harwell-Boeing, Slapc or other ASCII format?

If you can read or write your matrix using Python or MATLAB/Octave, `PetscBinaryIO`
modules are provided at `$PETSC_DIR/lib/petsc/bin` for each language that can assist
with reading and writing. If you just want to convert `MatrixMarket`, you can use:

```console
$ python -m $PETSC_DIR/lib/petsc/bin/PetscBinaryIO convert matrix.mtx
```

This produces `matrix.petsc`.

You can also call the script directly or import it from your Python code. There is also a
`PETScBinaryIO.jl` Julia package.

For other formats, either adapt one of the above libraries or see the examples in
`$PETSC_DIR/src/mat/tests`, specifically `ex72.c` or `ex78.c`. You will likely need
to modify the code slightly to match your required ASCII format.

:::{note}
Never read or write an ASCII matrix file in parallel.

Instead, read it in sequentially with a standalone code based on `ex72.c` or `ex78.c`,
then save the matrix with the binary viewer `PetscViewerBinaryOpen()` and load the
matrix in parallel in your "real" PETSc program with `MatLoad()`.

For writing, save with the binary viewer and then load with the sequential code to store
it as ASCII.
:::

### Does TSSetFromOptions(), SNESSetFromOptions(), or KSPSetFromOptions() reset all the parameters I previously set, or why do they not seem to work?

If `XXSetFromOptions()` is used (with `-xxx_type aaaa`) to change the type of the
object then all parameters associated with the previous type are removed. Otherwise it
does not reset parameters.

`TS/SNES/KSPSetXXX()` commands that set properties for a particular type of object (such
as `KSPGMRESSetRestart()`) ONLY work if the object is ALREADY of that type. For example,
with

```c
KSP ksp;

PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
PetscCall(KSPGMRESSetRestart(ksp, 10));
```

the restart will be ignored since the type has not yet been set to `KSPGMRES`. To have
those values take effect you should do one of the following:

- Allow setting the type from the command line; if it is not on the command line then the
  default type is automatically set.

```
/* Create generic object */
XXXCreate(..,&obj);
/* Must set all settings here, or default */
XXXSetFromOptions(obj);
```

- Hardwire the type in the code, but allow the user to override it via a subsequent
  `XXXSetFromOptions()` call. This essentially allows the user to customize what the
  "default" type of the object is.

```
/* Create generic object */
XXXCreate(..,&obj);
/* Set type directly */
XXXSetYYYYY(obj,...);
/* Can always change to different type */
XXXSetFromOptions(obj);
```

### How do I compile and link my own PETSc application codes and can I use my own `makefile` or rules for compiling code, rather than PETSc's?

See the {ref}`section <sec_writing_application_codes>` of the users manual on writing
application codes with PETSc.

### Can I use CMake to build my own project that depends on PETSc?

See the {ref}`section <sec_writing_application_codes>` of the users manual on writing
application codes with PETSc.

### How can I put carriage returns in `PetscPrintf()` statements from Fortran?

You can use the same notation as in C, just put a `\n` in the string. Note that no other C
format instruction is supported.

Or you can use the Fortran concatenation `//` and `char(10)`; for example `'some
string'//char(10)//'another string'` on the next line.

### How can I implement callbacks using C++ class methods?

Declare the class method static. Static methods do not have a `this` pointer, but the
`void*` context parameter will usually be cast to a pointer to the class where it can
serve the same function.

:::{admonition} Remember
All PETSc callbacks return `PetscErrorCode`.
:::
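
The static-method pattern can be sketched without PETSc; the callback signature, the `MyProblem` class, and the `invoke()` routine below are all invented stand-ins (real PETSc callbacks return `PetscErrorCode` and have library-specific signatures):

```cpp
#include <cassert>

// A PETSc-style callback signature: plain function pointer plus a void* context.
typedef int (*ResidualFn)(double x, double *out, void *ctx);

// Hypothetical user class whose state the callback needs.
class MyProblem {
public:
  explicit MyProblem(double shift) : shift_(shift) {}

  // Static method: no `this` pointer, so it matches the C function-pointer type.
  // The void* context is cast back to the owning class instance.
  static int Residual(double x, double *out, void *ctx) {
    MyProblem *self = static_cast<MyProblem *>(ctx);
    *out = x - self->shift_;
    return 0;
  }

private:
  double shift_;
};

// Stand-in for a C library routine that stores and later invokes the callback.
int invoke(ResidualFn fn, double x, double *out, void *ctx) { return fn(x, out, ctx); }
```

When registering with PETSc, the class instance is passed as the context argument (e.g. the final argument of `SNESSetFunction()`), exactly as `&p` is passed to `invoke()` here.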

### Everyone knows that when you code Newton's Method you should compute the function and its Jacobian at the same time. How can one do this in PETSc?

The update in Newton's method is computed as

$$
u^{n+1} = u^n - \lambda \left[J(u^n)\right]^{-1} F(u^n)
$$

The reason PETSc doesn't default to computing both the function and Jacobian at the same
time is:

- In order to do the line search, $F \left(u^n - \lambda \, \text{step} \right)$ may
  need to be computed for several $\lambda$. The Jacobian is not needed for each of
  those, and one does not know in advance which will be the final $\lambda$ until
  after the function value is computed, so many extra Jacobians may be computed.
- In the final step, if $|| F(u^{n+1})||$ satisfies the convergence criteria then a
  Jacobian need not be computed.

You are free to have your `FormFunction()` compute as much of the Jacobian at that point
as you like, keep the information in the user context (the final argument to
`FormFunction()` and `FormJacobian()`), and then retrieve the information in your
`FormJacobian()` function.
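
The caching idea can be sketched in plain C++ for the scalar model problem $F(u) = e^u - 2$, whose Jacobian $J(u) = e^u$ falls out of the residual evaluation for free. The context struct and function names are illustrative, not PETSc API:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical user context: FormFunction stashes quantities here that
// FormJacobian can later reuse instead of recomputing them.
struct UserCtx {
  double cached_exp;  // exp(u) computed during the residual evaluation
  bool   cache_valid; // has FormFunction run since the last state change?
};

// Residual F(u) = exp(u) - 2. While evaluating F we also obtain exp(u),
// which happens to equal J(u) = F'(u), so we cache it in the context.
int FormFunction(double u, double *f, UserCtx *ctx) {
  double e = std::exp(u);
  *f = e - 2.0;
  ctx->cached_exp = e;
  ctx->cache_valid = true;
  return 0;
}

// Jacobian retrieves the cached value rather than recomputing exp(u).
int FormJacobian(double u, double *J, UserCtx *ctx) {
  *J = ctx->cache_valid ? ctx->cached_exp : std::exp(u);
  return 0;
}
```

In a real PETSc code the `UserCtx` pointer is the final argument registered with `SNESSetFunction()` and `SNESSetJacobian()`.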

### Computing the Jacobian or preconditioner is time consuming. Is there any way to compute it less often?

PETSc has a variety of ways of lagging the computation of the Jacobian or the
preconditioner. They are documented in the manual page for `SNESComputeJacobian()`
and in the {ref}`users manual <ch_snes>`:

- `-snes_lag_jacobian` (`SNESSetLagJacobian()`): how often the Jacobian is rebuilt
  (use -1 to never rebuild, use -2 to rebuild the next time requested and then
  never again).
- `-snes_lag_jacobian_persists` (`SNESSetLagJacobianPersists()`): forces lagging of
  the Jacobian through multiple `SNES` solves, same as passing -2 to
  `-snes_lag_jacobian`. By default, each new `SNES` solve triggers a
  recomputation of the Jacobian.
- `-snes_lag_preconditioner` (`SNESSetLagPreconditioner()`): how often the
  preconditioner is rebuilt. Note: if you are lagging the Jacobian the system will
  know that the matrix has not changed and will not recompute the (same)
  preconditioner.
- `-snes_lag_preconditioner_persists` (`SNESSetLagPreconditionerPersists()`): the
  preconditioner lags through multiple `SNES` solves.

:::{note}
These are often (but need not be) used in combination with
`-snes_mf_operator`, which applies the fresh Jacobian matrix-free for every
matrix-vector product. Otherwise the out-of-date matrix-vector product, computed with
the lagged Jacobian, will be used.
:::

By using `KSPMonitorSet()` and/or `SNESMonitorSet()` one can provide code that monitors the
convergence rate and automatically triggers an update of the Jacobian or preconditioner
based on decreasing convergence of the iterative method. For example, if the number of `SNES`
iterations doubles one might trigger a new computation of the Jacobian. Experimentation is
the only general purpose way to determine which approach is best for your problem.
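
The doubling heuristic a monitor might apply can be sketched in a few lines; the function name and thresholds are illustrative, not a PETSc API:

```cpp
#include <cassert>

// Illustrative heuristic a SNES/KSP monitor might apply: request a fresh
// Jacobian once the iteration count has doubled relative to the count
// observed right after the last rebuild.
bool should_rebuild_jacobian(int its_at_last_rebuild, int current_its) {
  if (its_at_last_rebuild <= 0) return false; // no baseline recorded yet
  return current_its >= 2 * its_at_last_rebuild;
}
```

A monitor registered with `SNESMonitorSet()` would track the baseline in its context and, when the heuristic fires, mark the Jacobian for recomputation.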

:::{important}
It is also vital to experiment on your true problem at the scale at which you will be
solving it, since the performance benefits depend on the exact problem and the problem
size!
:::

### How can I use Newton's Method Jacobian free? Can I difference a different function than provided with `SNESSetFunction()`?

The simplest way is with the option `-snes_mf`; this will use finite differencing of the
function provided to `SNESComputeFunction()` to approximate the action of the Jacobian.

:::{important}
Since no matrix representation of the Jacobian is provided, the `-pc_type` used with
this option must be `-pc_type none`. You may provide a custom preconditioner with
`SNESGetKSP()`, `KSPGetPC()`, and `PCSetType()` and use `PCSHELL`.
:::

The option `-snes_mf_operator` will use the Jacobian-free approach to apply the Jacobian (in the
Krylov solvers) but will use whatever matrix you provided with `SNESSetJacobian()`
(assuming you set one) to compute the preconditioner.

To write the code (rather than use the options above) use `MatCreateSNESMF()` and pass
the resulting matrix object to `SNESSetJacobian()`.

For purely matrix-free (like `-snes_mf`) pass the matrix object for both matrix
arguments and pass the function `MatMFFDComputeJacobian()`.

To provide your own approximate Jacobian matrix to compute the preconditioner (like
`-snes_mf_operator`), pass this other matrix as the second matrix argument to
`SNESSetJacobian()`. Make sure your provided `computejacobian()` function calls
`MatAssemblyBegin()` and `MatAssemblyEnd()` separately on **BOTH** matrix arguments
to this function. See `src/snes/tests/ex7.c`.

To difference a different function than that passed to `SNESSetFunction()` to compute the
matrix-free Jacobian multiply, call `MatMFFDSetFunction()` to set that other function. See
`src/snes/tests/ex7.c`.

(doc_faq_usage_condnum)=

### How can I determine the condition number of a matrix?

For small matrices, the condition number can be reliably computed using

```text
-pc_type svd -pc_svd_monitor
```

For larger matrices, you can run with

```text
-pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000
```

to get approximations to the condition number of the operator. This will generally be
accurate for the largest singular values, but may overestimate the smallest singular value
unless the method has converged. Make sure to avoid restarts. To estimate the condition
number of the preconditioned operator, use `-pc_type somepc` in the last command.

You can use [SLEPc](https://slepc.upv.es) for highly scalable, efficient, and quality eigenvalue computations.

### How can I compute the inverse of a matrix in PETSc?

:::{admonition} Are you sure?
:class: yellow

It is very expensive to compute the inverse of a matrix and very rarely needed in
practice. We highly recommend avoiding algorithms that need it.
:::

The inverse of a matrix (dense or sparse) is essentially always dense, so begin by
creating a dense matrix B and filling it with the identity matrix (ones along the diagonal);
also create a dense matrix X of the same size that will hold the solution. Then factor the
matrix you wish to invert with `MatLUFactor()` or `MatCholeskyFactor()`; call the
result A. Then call `MatMatSolve(A,B,X)` to compute the inverse into X. See also the section
on the {any}`Schur complement <how_can_i_compute_the_schur_complement>`.
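
The "invert by solving $AX = I$" idea can be sketched in self-contained C++ with a toy Gauss-Jordan routine; this is a stand-in for the `MatLUFactor()`/`MatMatSolve()` sequence, not production code, and the name `invert` is invented for the example:

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <utility>

// Toy dense inverse via "solve A X = I": Gauss-Jordan with partial pivoting
// applied to [A | X], where X starts as the identity (the matrix B above).
template <int N>
std::array<std::array<double, N>, N> invert(std::array<std::array<double, N>, N> A) {
  std::array<std::array<double, N>, N> X{};
  for (int i = 0; i < N; ++i) X[i][i] = 1.0; // X = identity

  for (int col = 0; col < N; ++col) {
    // Partial pivoting: swap in the largest remaining pivot row.
    int piv = col;
    for (int r = col + 1; r < N; ++r)
      if (std::fabs(A[r][col]) > std::fabs(A[piv][col])) piv = r;
    std::swap(A[col], A[piv]);
    std::swap(X[col], X[piv]);

    double d = A[col][col]; // assume nonsingular
    for (int c = 0; c < N; ++c) { A[col][c] /= d; X[col][c] /= d; }

    // Eliminate the column from every other row.
    for (int r = 0; r < N; ++r) {
      if (r == col) continue;
      double f = A[r][col];
      for (int c = 0; c < N; ++c) { A[r][c] -= f * A[col][c]; X[r][c] -= f * X[col][c]; }
    }
  }
  return X; // X now holds A^{-1}
}
```

Note how the right-hand side starts as the identity and receives every row operation applied to A, which is exactly what `MatMatSolve(A,B,X)` does with a factored A and B = I.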
9869b92b1d3SBarry Smith
(how_can_i_compute_the_schur_complement)=

### How can I compute the Schur complement in PETSc?

:::{admonition} Are you sure?
:class: yellow

It is very expensive to compute the Schur complement of a matrix and very rarely needed
in practice. We highly recommend avoiding algorithms that need it.
:::

The Schur complement of the matrix $M \in \mathbb{R}^{\left(p+q \right) \times
\left(p + q \right)}$

$$
M = \begin{bmatrix}
A & B \\
C & D
\end{bmatrix}
$$

where

$$
A \in \mathbb{R}^{p \times p}, \quad B \in \mathbb{R}^{p \times q}, \quad C \in \mathbb{R}^{q \times p}, \quad D \in \mathbb{R}^{q \times q}
$$

is given by

$$
S_D := A - BD^{-1}C \\
S_A := D - CA^{-1}B
$$

Like the inverse, the Schur complement of a matrix (dense or sparse) is essentially always
dense, so assuming you wish to calculate $S_A = D - C \underbrace{
\overbrace{(A^{-1})}^{U} B}_{V}$ begin by:

1. Create a dense matrix $B$.
2. Create another dense matrix $V$ of the same size.
3. Either factor the matrix $A$ directly with `MatLUFactor()` or
   `MatCholeskyFactor()`, or use `MatGetFactor()` followed by
   `MatLUFactorSymbolic()` and `MatLUFactorNumeric()` if you wish to use an
   external solver package like SuperLU_Dist. Call the result $U$.
4. Call `MatMatSolve(U,B,V)`.
5. Call `MatMatMult(C,V,MAT_INITIAL_MATRIX,1.0,&S)`.
6. Call `MatAXPY(S,-1.0,D,SUBSET_NONZERO_PATTERN)`.
7. Finally, call `MatScale(S,-1.0)`.

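
The arithmetic of these steps can be checked on a tiny example without PETSc. This
plain-C sketch (the function name `schur_2p1` is ours) performs the same sequence for
$p = 2$, $q = 1$: solve with $A$, multiply by $C$, subtract $D$, flip the sign.

```
#include <assert.h>
#include <math.h>

/* Illustration only (plain C, no PETSc): S_A = D - C A^{-1} B for p = 2, q = 1,
 * i.e. A is 2x2, B is 2x1, C is 1x2, D is 1x1. Comments map each line to the
 * steps described above. */
static double schur_2p1(const double A[2][2], const double B[2],
                        const double C[2], double D)
{
  double det = A[0][0] * A[1][1] - A[0][1] * A[1][0];
  /* "factor" A and solve: V = A^{-1} B (the MatMatSolve step) */
  double V0 = ( A[1][1] * B[0] - A[0][1] * B[1]) / det;
  double V1 = (-A[1][0] * B[0] + A[0][0] * B[1]) / det;
  /* S = C * V (the MatMatMult step) */
  double S = C[0] * V0 + C[1] * V1;
  /* S = S - D (the MatAXPY step), then S = -S (the MatScale step) */
  return -(S - D);
}
```
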
When computing Schur complements this way it does not make sense to use the `KSP`
iterative solvers, since for many moderate-size problems a direct factorization is much
faster than an iterative solve. As you can see, this approach requires a great deal of
work space and computation, so it is best avoided.

However, it is not necessary to assemble the Schur complement $S$ in order to solve
systems with it. Use `MatCreateSchurComplement(A,A_pre,B,C,D,&S)` to create a
matrix that applies the action of $S$ (using `A_pre` to solve with `A`), but
does not assemble.

Alternatively, if you already have a block matrix `M = [A, B; C, D]` (in some
ordering), then you can create index sets (`IS`) `isa` and `isb` to address each
block, then use `MatGetSchurComplement()` to create the Schur complement and/or an
approximation suitable for preconditioning.

Since $S$ is generally dense, standard preconditioning methods cannot typically be
applied directly to Schur complements. There are many approaches to preconditioning Schur
complements including using the `SIMPLE` approximation

$$
D - C \text{diag}(A)^{-1} B
$$

to create a sparse matrix that approximates the Schur complement (this is returned by
default for the optional "preconditioning" matrix in `MatGetSchurComplement()`).

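
The SIMPLE approximation replaces $A^{-1}$ by $\text{diag}(A)^{-1}$, which keeps the
result cheap to form. A minimal plain-C sketch for $p = 2$, $q = 1$ (the name
`simple_2p1` is ours, for illustration):

```
#include <assert.h>
#include <math.h>

/* Illustration only (plain C, no PETSc): the SIMPLE-style approximation
 * D - C diag(A)^{-1} B for p = 2, q = 1. Only the diagonal of A is inverted,
 * so for sparse blocks the result stays sparse and cheap to form. */
static double simple_2p1(const double A[2][2], const double B[2],
                         const double C[2], double D)
{
  return D - (C[0] * B[0] / A[0][0] + C[1] * B[1] / A[1][1]);
}
```

When $A$ happens to be diagonal the approximation is exact; its quality degrades as the
off-diagonal part of $A$ grows.
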
An alternative is to interpret the matrices as differential operators and apply
approximate commutator arguments to find a spectrally equivalent operation that can be
applied efficiently (see the "PCD" preconditioners {cite}`elman_silvester_wathen_2014`). A
variant of this is the least squares commutator, which is closely related to the
Moore-Penrose pseudoinverse, and is available in `PCLSC` which operates on matrices of
type `MATSCHURCOMPLEMENT`.

### Do you have examples of doing unstructured grid Finite Element Method (FEM) with PETSc?

There are at least three ways to write finite element codes using PETSc:

1. Use `DMPLEX`, which is a high level approach to manage your mesh and
   discretization. See the {ref}`tutorials sections <tut_stokes>` for further information,
   or see `src/snes/tutorials/ex62.c`.
2. Use packages such as [deal.ii](https://www.dealii.org), [libMesh](https://libmesh.github.io), or
   [Firedrake](https://www.firedrakeproject.org), which use PETSc for the solvers.
3. Manage the grid data structure yourself and use PETSc `PetscSF`, `IS`, and `VecScatter` to
   handle the required ghost point communication. See
   `src/snes/tutorials/ex10d/ex10.c`.

### DMDA decomposes the domain differently than the MPI_Cart_create() command. How can one use them together?

The `MPI_Cart_create()` first divides the mesh along the z direction, then the y, then
the x. `DMDA` divides along the x, then y, then z. Thus, for example, rank 1 of the
processes will be in a different part of the mesh for the two schemes. To resolve this you
can create a new MPI communicator that you pass to `DMDACreate()` that renumbers the
process ranks so that each physical process shares the same part of the mesh with both the
`DMDA` and the `MPI_Cart_create()`. The code to determine the new numbering was
provided by Rolf Kuiper:

```
// the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively
// (no parallelization in direction 'dir' means dir_procs = 1)

MPI_Comm    NewComm;
int         x, y, z;
PetscMPIInt MPI_Rank, NewRank;

// get rank from MPI ordering:
PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &MPI_Rank));

// calculate coordinates of cpus in MPI ordering:
x = MPI_Rank / (z_procs*y_procs);
y = (MPI_Rank % (z_procs*y_procs)) / z_procs;
z = (MPI_Rank % (z_procs*y_procs)) % z_procs;

// set new rank according to PETSc ordering:
NewRank = z*y_procs*x_procs + y*x_procs + x;

// create communicator with new ranks according to PETSc ordering
PetscCallMPI(MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm));

// override the default communicator (was MPI_COMM_WORLD as default)
PETSC_COMM_WORLD = NewComm;
```

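
The renumbering arithmetic can be sanity-checked without MPI: the map from
`MPI_Cart_create()`-style ranks to `DMDA`-style ranks must be a permutation of
`0..P-1`. A plain-C sketch (the helper name `petsc_rank` is ours):

```
#include <assert.h>

/* Illustration only (plain C, no MPI): the same renumbering arithmetic as
 * above, mapping an MPI_Cart_create-style rank to the corresponding
 * DMDA-style rank for an x_procs * y_procs * z_procs process grid. */
static int petsc_rank(int mpi_rank, int x_procs, int y_procs, int z_procs)
{
  int x = mpi_rank / (z_procs * y_procs);
  int y = (mpi_rank % (z_procs * y_procs)) / z_procs;
  int z = (mpi_rank % (z_procs * y_procs)) % z_procs;
  return z * y_procs * x_procs + y * x_procs + x;
}
```
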
### When solving a system with Dirichlet boundary conditions I can use MatZeroRows() to eliminate the Dirichlet rows but this results in a non-symmetric system. How can I apply Dirichlet boundary conditions but keep the matrix symmetric?

- For nonsymmetric systems put the appropriate boundary solutions in the x vector and use
  `MatZeroRows()` followed by `KSPSetOperators()`.
- For symmetric problems use `MatZeroRowsColumns()`.
- If you have many Dirichlet locations you can use `MatZeroRows()` (**not**
  `MatZeroRowsColumns()`) and `-ksp_type preonly -pc_type redistribute` (see
  `PCREDISTRIBUTE`) and PETSc will repartition the parallel matrix for load
  balancing. In this case the new matrix solved remains symmetric even though
  `MatZeroRows()` is used.

An alternative approach is, when assembling the matrix (generating values and passing
them to the matrix), never to include locations for the Dirichlet grid points in the
vector and matrix, instead taking them into account as you put the other values into the
load vector.

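
The symmetric elimination that `MatZeroRowsColumns()` performs can be sketched in plain
C on a small dense system (the helper name `dirichlet_symmetric` is ours): the known
value is lifted into the right-hand side, then row and column j are zeroed and the
diagonal entry set to 1, so the matrix stays symmetric.

```
#include <assert.h>
#include <math.h>

/* Illustration only (plain C, no PETSc): symmetric elimination of one
 * Dirichlet dof j with value g on a 3x3 system K u = b, in the spirit of
 * MatZeroRowsColumns(). The known column is moved into the load vector,
 * then row/column j are zeroed and the diagonal entry set to 1. */
static void dirichlet_symmetric(double K[3][3], double b[3], int j, double g)
{
  for (int i = 0; i < 3; i++) {
    if (i == j) continue;
    b[i] -= K[i][j] * g; /* lift the known value into the load vector */
    K[i][j] = 0.0;
    K[j][i] = 0.0;
  }
  K[j][j] = 1.0;
  b[j] = g; /* the trivial equation u_j = g */
}
```
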
### How can I get PETSc vectors and matrices to MATLAB or vice versa?

There are numerous ways to work with PETSc and MATLAB. All but the first approach
require PETSc to be configured with `--with-matlab`.

1. To save PETSc `Mat` and `Vec` to files that can be read from MATLAB use the
   `PetscViewerBinaryOpen()` viewer and `VecView()` or `MatView()` to save objects
   for MATLAB and `VecLoad()` and `MatLoad()` to get the objects that MATLAB has
   saved. See `share/petsc/matlab/PetscBinaryRead.m` and
   `share/petsc/matlab/PetscBinaryWrite.m` for loading and saving the objects in MATLAB.
2. Using the [MATLAB Engine](https://www.mathworks.com/help/matlab/calling-matlab-engine-from-c-programs-1.html)
   allows PETSc to automatically call MATLAB to perform some specific computations. This
   does not allow MATLAB to be used interactively by the user. See
   `PetscMatlabEngine`.
3. You can open a socket connection between MATLAB and PETSc to allow sending objects back
   and forth between an interactive MATLAB session and a running PETSc program. See
   `PetscViewerSocketOpen()` for access from the PETSc side and
   `share/petsc/matlab/PetscBinaryRead.m` for access from the MATLAB side.
4. You can save PETSc `Vec` (**not** `Mat`) with the `PetscViewerMatlabOpen()`
   viewer, which saves `.mat` files that can then be loaded into MATLAB using the
   `load()` command.

### How do I get started with Cython so that I can extend petsc4py?

1. Learn how to [build a Cython module](http://docs.cython.org/src/quickstart/build.html).
2. Go through the simple [example](https://stackoverflow.com/questions/3046305/simple-wrapping-of-c-code-with-cython). Note
   also the next comment that shows how to create numpy arrays in Cython and pass them
   back.
3. Check out [this page](http://docs.cython.org/src/tutorial/numpy.html) which tells
   you how to get fast indexing.
4. Have a look at the petsc4py [array source](http://code.google.com/p/petsc4py/source/browse/src/PETSc/arraynpy.pxi).

### How do I compute a custom norm for KSP to use as a convergence test or for monitoring?

You need to call `KSPBuildResidual()` on your `KSP` object and then compute the
appropriate norm on the resulting residual. Note that depending on the
`KSPSetNormType()` of the method, the norm you compute may not match the norm used
internally by the method. See also `KSPSetPCSide()`.

### If I have a sequential program can I use a PETSc parallel solver?

:::{important}
Do not expect to get great speedups! Much of the speedup gained by using parallel
solvers comes from building the underlying matrices and vectors in parallel to begin
with. You should see some reduction in the time for the linear solvers.
:::

Yes, you must set up PETSc with MPI (even though you will not use MPI) with at least the
following options:

```console
$ ./configure --download-superlu_dist --download-parmetis --download-metis --with-openmp
```

Your compiler must support OpenMP. To have the linear solver run in parallel run your
program with

```console
$ OMP_NUM_THREADS=n ./myprog -pc_type lu -pc_factor_mat_solver_type superlu_dist
```

where `n` is the number of threads and should be less than or equal to the number of cores
available.

:::{note}
If your code is MPI parallel you can also use these same options to have SuperLU_Dist
utilize multiple threads per MPI process for the direct solver. Make sure that the
`$OMP_NUM_THREADS` you use per MPI process is less than or equal to the number of
cores available for each MPI process. For example if your compute nodes have 6 cores
and you use 2 MPI processes per node then set `$OMP_NUM_THREADS` to 2 or 3.
:::

Another approach that allows using a PETSc parallel solver is to use `PCMPI`.

### TS or SNES produces infeasible (out of domain) solutions or states. How can I prevent this?

For `TS` call `TSSetFunctionDomainError()`. For both `TS` and `SNES` call
`SNESSetFunctionDomainError()` when the solver passes an infeasible (out of domain)
solution or state to your routines.

If it occurs for DAEs, it is important to ensure the algebraic constraints are well
satisfied, which can prevent "breakdown" later. Thus, one can try using a tight tolerance
for `SNES`, using a direct linear solver (`PCType` of `PCLU`) when possible, and reducing the timestep (or
tightening `TS` tolerances for adaptive time stepping).

### Can PETSc work with Hermitian matrices?

PETSc's support of Hermitian matrices is limited. Many operations and solvers work
for symmetric (`MATSBAIJ`) matrices and operations on transpose matrices but there is
little direct support for Hermitian matrices and Hermitian transpose (complex conjugate
transpose) operations. There is `KSPSolveTranspose()` for solving the transpose of a
linear system but no `KSPSolveHermitian()`.

For creating known Hermitian matrices:

- `MatCreateNormalHermitian()`
- `MatCreateHermitianTranspose()`

For determining or setting Hermitian status on existing matrices:

- `MatIsHermitian()`
- `MatIsHermitianKnown()`
- `MatIsStructurallySymmetric()`
- `MatIsSymmetricKnown()`
- `MatIsSymmetric()`
- `MatSetOption()` (use with `MAT_SYMMETRIC` or `MAT_HERMITIAN` to assert to PETSc
  that either is the case).

For performing matrix operations on known Hermitian matrices (note that regular `Mat`
functions such as `MatMult()` will of course also work on Hermitian matrices):

- `MatMultHermitianTranspose()`
- `MatMultHermitianTransposeAdd()` (very limited support)

### How can I assemble a bunch of similar matrices?

You can first add the values common to all the matrices, then use `MatStoreValues()` to
stash the common values. Each iteration you call `MatRetrieveValues()`, then set the
unique values in a matrix and assemble.

### Can one resize or change the size of PETSc matrices or vectors?

No, once the vector or matrix sizes have been set and the matrices or vectors are fully
usable one cannot change the size of the matrices or vectors or the number of processes
they live on. One may create new vectors and copy the old values into them, for example
using `VecScatterCreate()`.

### How can one compute the nullspace of a sparse matrix with MUMPS?

Assuming you have an existing matrix $A$ whose nullspace $V$ you want to find:

```
Mat      F, work, V;
PetscInt N, rows;
MPI_Comm comm = PetscObjectComm((PetscObject)A);

/* Determine factorability */
PetscCall(MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F));
PetscCall(MatGetLocalSize(A, &rows, NULL));

/* Set MUMPS options, see MUMPS documentation for more information */
PetscCall(MatMumpsSetIcntl(F, 24, 1));
PetscCall(MatMumpsSetIcntl(F, 25, 1));

/* Perform factorization */
PetscCall(MatLUFactorSymbolic(F, A, NULL, NULL, NULL));
PetscCall(MatLUFactorNumeric(F, A, NULL));

/* This is the dimension of the null space */
PetscCall(MatMumpsGetInfog(F, 28, &N));

/* This will contain the null space in the columns */
PetscCall(MatCreateDense(comm, rows, N, PETSC_DETERMINE, PETSC_DETERMINE, NULL, &V));

PetscCall(MatDuplicate(V, MAT_DO_NOT_COPY_VALUES, &work));
PetscCall(MatMatSolve(F, work, V));
```

______________________________________________________________________

## Execution

### PETSc executables are SO big and take SO long to link

:::{note}
See {ref}`shared libraries section <doc_faq_sharedlibs>` for more information.
:::

We find this annoying as well. On most machines PETSc can use shared libraries, so
executables should be much smaller; run `configure` with the additional option
`--with-shared-libraries` (this is the default). Also, if you have room, compiling and
linking PETSc on your machine's `/tmp` disk or similar local disk, rather than over the
network, will be much faster.

### How does PETSc's `-help` option work? Why is it different for different programs?

There are two ways in which one interacts with the options database:

- `PetscOptionsGetXXX()` where `XXX` is some type or data structure (for example
  `PetscOptionsGetBool()` or `PetscOptionsGetScalarArray()`). This is a classic
  "getter" function, which queries the command line options for a matching option name,
  and returns the specified value.
- `PetscOptionsXXX()` where `XXX` is some type or data structure (for example
  `PetscOptionsBool()` or `PetscOptionsScalarArray()`). This is a so-called "provider"
  function. It first records the option name in an internal list of previously encountered
  options before calling `PetscOptionsGetXXX()` to query the status of said option.

While users generally use the first option, developers will *always* use the second
(provider) variant of functions. Thus, as the program runs, it will build up a list of
encountered option names which are then printed **in the order of their appearance on the
root rank**. Different programs may take different paths through PETSc source code, so
they will encounter different providers, and therefore have different `-help` output.

### PETSc has so many options for my program that it is hard to keep them straight

Running the PETSc program with the option `-help` will print out many of the options. To
see which options have been specified within a program, use `-options_left`, which
prints all options used as well as any options that the user specified but that were not
actually used by the program; this is helpful for detecting typos. The PETSc website has
a search option, in the upper right hand corner, that quickly finds answers to most PETSc
questions.

### PETSc automatically handles many of the details in parallel PDE solvers. How can I understand what is really happening within my program?

You can use the option `-info` to get more details about the solution process. The
option `-log_view` provides details about the distribution of time spent in the various
phases of the solution process. You can run with `-ts_view` or `-snes_view` or
`-ksp_view` to see what solver options are being used. Run with `-ts_monitor`,
`-snes_monitor`, or `-ksp_monitor` to watch convergence of the
methods. `-snes_converged_reason` and `-ksp_converged_reason` will indicate why and if
the solvers have converged.

### Assembling large sparse matrices takes a long time. What can I do to make this process faster? Or MatSetValues() is so slow; what can I do to speed it up?

You probably need to do preallocation, as explained in {any}`sec_matsparse`.
See also the {ref}`performance chapter <ch_performance>` of the users manual.

For GPUs (and even CPUs) you can use `MatSetPreallocationCOO()` and `MatSetValuesCOO()` for more rapid assembly.

### How can I generate performance summaries with PETSc?

Use the runtime option `-log_view`, which outputs a comprehensive timing, memory
consumption, and communications digest for your program. See the
{ref}`profiling chapter <ch_profiling>` of the users manual for information on
interpreting the summary data.

### How do I know the amount of time spent on each level of the multigrid solver/preconditioner?

Run with `-log_view` and `-pc_mg_log`.

### Where do I get the input matrices for the examples?

Some examples use `$DATAFILESPATH/matrices/medium` and other files. These test matrices
in PETSc binary format can be found in the [datafiles repository](https://gitlab.com/petsc/datafiles).

### When I dump some matrices and vectors to binary, I seem to be generating some empty files with `.info` extensions. What's the deal with these?

PETSc binary viewers put some additional information into `.info` files like matrix
block size. It is harmless but if you *really* don't like it you can use
`-viewer_binary_skip_info` or `PetscViewerBinarySkipInfo()`.

:::{note}
You need to call `PetscViewerBinarySkipInfo()` before
`PetscViewerFileSetName()`. In other words you **cannot** use
`PetscViewerBinaryOpen()` directly.
:::

### Why is my parallel solver slower than my sequential solver, or I have poor speed-up?

This can happen for many reasons:

1. Make sure it is truly the time in `KSPSolve()` that is slower (by running the code
   with `-log_view`). Often the slower time is in generating the matrix or some other
   operation.
2. There must be enough work for each process to outweigh the communication time. We
   recommend an absolute minimum of about 10,000 unknowns per process, better is 20,000 or
   more. This is even more true when using multiple GPUs, where you need to have millions
   of unknowns per GPU.
3. Make sure the {ref}`communication speed of the parallel computer
   <doc_faq_general_parallel>` is good enough for parallel solvers.
4. Check the number of solver iterations with the parallel solver against the sequential
   solver. Most preconditioners require more iterations when used on more processes; this
   is particularly true for block Jacobi (the default parallel preconditioner). You can
   try `-pc_type asm` (`PCASM`), whose iterations scale a bit better with more
   processes. You may also consider multigrid preconditioners like `PCMG` or BoomerAMG
   in `PCHYPRE`.

(doc_faq_pipelined)=

### What steps are necessary to make the pipelined solvers execute efficiently?

Pipelined solvers like `KSPPGMRES`, `KSPPIPECG`, `KSPPIPECR`, and `KSPGROPPCG` may
require special MPI configuration to effectively overlap reductions with computation. In
general, this requires an MPI-3 implementation, an implementation that supports multiple
threads, and use of a "progress thread". Consult the documentation from your vendor or
computing facility for more details.

- Cray MPT 5.6 and later implements MPI-3; set `$MPICH_MAX_THREAD_SAFETY` to "multiple"
  for threads, plus either `$MPICH_ASYNC_PROGRESS` or
  `$MPICH_NEMESIS_ASYNC_PROGRESS`. E.g.
  ```console
  $ export MPICH_MAX_THREAD_SAFETY=multiple
  $ export MPICH_ASYNC_PROGRESS=1
  $ export MPICH_NEMESIS_ASYNC_PROGRESS=1
  ```

- MPICH version 3.0 and later implements the MPI-3 standard and the default
  configuration supports use of threads. Use of a progress thread is configured by
  setting the environment variable `$MPICH_ASYNC_PROGRESS`. E.g.
  ```console
  $ export MPICH_ASYNC_PROGRESS=1
  ```

### When using PETSc in single precision mode (`--with-precision=single` when running `configure`) are the operations done in single or double precision?

PETSc does **NOT** do any explicit conversion of single precision to double before
performing computations; it depends on the hardware and compiler for what happens. For
example, the compiler could choose to put the single precision numbers into the usual
double precision registers and then use the usual double precision floating point unit. Or
it could use SSE2 instructions that work directly on the single precision numbers. It is a
bit of a mystery what decisions get made sometimes. There may be compiler flags in some
circumstances that can affect this.

### Why is Newton's method (SNES) not converging, or converging slowly?

Newton's method may not converge for many reasons; here are some of the most common:

- The Jacobian is wrong (or correct in sequential but not in parallel).
- The linear system is {ref}`not solved <doc_faq_execution_kspconv>` or is not solved
  accurately enough.
- The Jacobian system has a singularity that the linear solver is not handling.
- There is a bug in the function evaluation routine.
- The function is not continuous or does not have continuous first derivatives (e.g. phase
  change or TVD limiters).
- The equations may not have a solution (e.g. limit cycle instead of a steady state) or
  there may be a "hill" between the initial guess and the steady state (e.g. reactants
  must ignite and burn before reaching a steady state, but the steady-state residual will
  be larger during combustion).

Here are some of the ways to help debug lack of convergence of Newton:

- Run on one processor to see if the problem is only in parallel.

- Run with `-info` to get more detailed information on the solution process.

- Run with the options

  ```text
  -snes_monitor -ksp_monitor_true_residual -snes_converged_reason -ksp_converged_reason
  ```

  - If the linear solve does not converge, check if the Jacobian is correct, then see
    {ref}`this question <doc_faq_execution_kspconv>`.
  - If the preconditioned residual converges, but the true residual does not, the
    preconditioner may be singular.
  - If the linear solve converges well, but the line search fails, the Jacobian may be
    incorrect.

- Run with `-pc_type lu` or `-pc_type svd` to see if the problem is a poor linear
  solver.

- Run with `-mat_view` or `-mat_view draw` to see if the Jacobian looks reasonable.

- Run with `-snes_test_jacobian -snes_test_jacobian_view` to see if the Jacobian you are
  using is wrong. Compare the output when you add `-mat_fd_type ds` to see if the result
  is sensitive to the choice of differencing parameter.
14829b92b1d3SBarry Smith
14839b92b1d3SBarry Smith- Run with `-snes_mf_operator -pc_type lu` to see if the Jacobian you are using is
14849b92b1d3SBarry Smith  wrong. If the problem is too large for a direct solve, try
14859b92b1d3SBarry Smith
14869b92b1d3SBarry Smith  ```text
14879b92b1d3SBarry Smith  -snes_mf_operator -pc_type ksp -ksp_ksp_rtol 1e-12.
14889b92b1d3SBarry Smith  ```
14899b92b1d3SBarry Smith
14909b92b1d3SBarry Smith  Compare the output when you add `-mat_mffd_type ds` to see if the result is sensitive
14919b92b1d3SBarry Smith  to choice of differencing parameter.
14929b92b1d3SBarry Smith
14939b92b1d3SBarry Smith- Run with `-snes_linesearch_monitor` to see if the line search is failing (this is
14949b92b1d3SBarry Smith  usually a sign of a bad Jacobian). Use `-info` in PETSc 3.1 and older versions,
14959b92b1d3SBarry Smith  `-snes_ls_monitor` in PETSc 3.2 and `-snes_linesearch_monitor` in PETSc 3.3 and
14969b92b1d3SBarry Smith  later.
14979b92b1d3SBarry Smith
14989b92b1d3SBarry SmithHere are some ways to help the Newton process if everything above checks out:
14999b92b1d3SBarry Smith
15009b92b1d3SBarry Smith- Run with grid sequencing (`-snes_grid_sequence` if working with a `DM` is all you
15019b92b1d3SBarry Smith  need) to generate better initial guess on your finer mesh.
15029b92b1d3SBarry Smith
15039b92b1d3SBarry Smith- Run with quad precision, i.e.
15049b92b1d3SBarry Smith
15059b92b1d3SBarry Smith  ```console
15069b92b1d3SBarry Smith  $ ./configure --with-precision=__float128 --download-f2cblaslapack
15079b92b1d3SBarry Smith  ```
15089b92b1d3SBarry Smith
15099b92b1d3SBarry Smith  :::{note}
15109b92b1d3SBarry Smith  quad precision requires PETSc 3.2 and later and recent versions of the GNU compilers.
15119b92b1d3SBarry Smith  :::
15129b92b1d3SBarry Smith
15139b92b1d3SBarry Smith- Change the units (nondimensionalization), boundary condition scaling, or formulation so
15149b92b1d3SBarry Smith  that the Jacobian is better conditioned. See [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
15159b92b1d3SBarry Smith  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).
15169b92b1d3SBarry Smith
15179b92b1d3SBarry Smith- Mollify features in the function that do not have continuous first derivatives (often
15189b92b1d3SBarry Smith  occurs when there are "if" statements in the residual evaluation, e.g. phase change or
15199b92b1d3SBarry Smith  TVD limiters). Use a variational inequality solver (`SNESVINEWTONRSLS`) if the
15209b92b1d3SBarry Smith  discontinuities are of fundamental importance.
15219b92b1d3SBarry Smith
15229b92b1d3SBarry Smith- Try a trust region method (`-ts_type tr`, may have to adjust parameters).
15239b92b1d3SBarry Smith
15249b92b1d3SBarry Smith- Run with some continuation parameter from a point where you know the solution, see
15259b92b1d3SBarry Smith  `TSPSEUDO` for steady-states.
15269b92b1d3SBarry Smith
15279b92b1d3SBarry Smith- There are homotopy solver packages like PHCpack that can get you all possible solutions
15289b92b1d3SBarry Smith  (and tell you that it has found them all) but those are not scalable and cannot solve
15299b92b1d3SBarry Smith  anything but small problems.
15309b92b1d3SBarry Smith
15319b92b1d3SBarry Smith(doc_faq_execution_kspconv)=
15329b92b1d3SBarry Smith
15339b92b1d3SBarry Smith### Why is the linear solver (KSP) not converging, or converges slowly?
15349b92b1d3SBarry Smith
15359b92b1d3SBarry Smith:::{tip}
15369b92b1d3SBarry SmithAlways run with `-ksp_converged_reason -ksp_monitor_true_residual` when trying to
15379b92b1d3SBarry Smithlearn why a method is not converging!
15389b92b1d3SBarry Smith:::
15399b92b1d3SBarry Smith
15409b92b1d3SBarry SmithCommon reasons for KSP not converging are:
15419b92b1d3SBarry Smith
15429b92b1d3SBarry Smith- A symmetric method is being used for a non-symmetric problem.
15439b92b1d3SBarry Smith
15449b92b1d3SBarry Smith- The equations are singular by accident (e.g. forgot to impose boundary
15459b92b1d3SBarry Smith  conditions). Check this for a small problem using `-pc_type svd -pc_svd_monitor`.
15469b92b1d3SBarry Smith
15479b92b1d3SBarry Smith- The equations are intentionally singular (e.g. constant null space), but the Krylov
15489b92b1d3SBarry Smith  method was not informed, see `MatSetNullSpace()`. Always inform your local Krylov
15499b92b1d3SBarry Smith  subspace solver of any change of singularity. Failure to do so will result in the
15509b92b1d3SBarry Smith  immediate revocation of your computing and keyboard operator licenses, as well as
15519b92b1d3SBarry Smith  a stern talking-to by the nearest Krylov Subspace Method representative.
15529b92b1d3SBarry Smith
15539b92b1d3SBarry Smith- The equations are intentionally singular and `MatSetNullSpace()` was used, but the
15549b92b1d3SBarry Smith  right-hand side is not consistent. You may have to call `MatNullSpaceRemove()` on the
15559b92b1d3SBarry Smith  right-hand side before calling `KSPSolve()`. See `MatSetTransposeNullSpace()`.
15569b92b1d3SBarry Smith
15579b92b1d3SBarry Smith- The equations are indefinite so that standard preconditioners don't work. Usually you
15589b92b1d3SBarry Smith  will know this from the physics, but you can check with
15599b92b1d3SBarry Smith
15609b92b1d3SBarry Smith  ```text
15619b92b1d3SBarry Smith  -ksp_compute_eigenvalues -ksp_gmres_restart 1000 -pc_type none
15629b92b1d3SBarry Smith  ```
15639b92b1d3SBarry Smith
15649b92b1d3SBarry Smith  For simple saddle point problems, try
15659b92b1d3SBarry Smith
15669b92b1d3SBarry Smith  ```text
15679b92b1d3SBarry Smith  -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_detect_saddle_point
15689b92b1d3SBarry Smith  ```
15699b92b1d3SBarry Smith
15709b92b1d3SBarry Smith  For more difficult problems, read the literature to find robust methods and ask
15719b92b1d3SBarry Smith  <mailto:petsc-users@mcs.anl.gov> or <mailto:petsc-maint@mcs.anl.gov> if you want advice about how to
15729b92b1d3SBarry Smith  implement them.
15739b92b1d3SBarry Smith
15749b92b1d3SBarry Smith- If the method converges in preconditioned residual, but not in true residual, the
15759b92b1d3SBarry Smith  preconditioner is likely singular or nearly so. This is common for saddle point problems
15769b92b1d3SBarry Smith  (e.g. incompressible flow) or strongly nonsymmetric operators (e.g. low-Mach hyperbolic
15779b92b1d3SBarry Smith  problems with large time steps).
15789b92b1d3SBarry Smith
15799b92b1d3SBarry Smith- The preconditioner is too weak or is unstable. See if `-pc_type asm -sub_pc_type lu`
15809b92b1d3SBarry Smith  improves the convergence rate. If GMRES is losing too much progress in the restart, see
15819b92b1d3SBarry Smith  if longer restarts help `-ksp_gmres_restart 300`. If a transpose is available, try
15829b92b1d3SBarry Smith  `-ksp_type bcgs` or other methods that do not require a restart.
15839b92b1d3SBarry Smith
15849b92b1d3SBarry Smith  :::{note}
15859b92b1d3SBarry Smith  Unfortunately convergence with these methods is frequently erratic.
15869b92b1d3SBarry Smith  :::
15879b92b1d3SBarry Smith
15889b92b1d3SBarry Smith- The preconditioner is nonlinear (e.g. a nested iterative solve), try `-ksp_type
15899b92b1d3SBarry Smith  fgmres` or `-ksp_type gcr`.
15909b92b1d3SBarry Smith
15919b92b1d3SBarry Smith- You are using geometric multigrid, but some equations (often boundary conditions) are
15929b92b1d3SBarry Smith  not scaled compatibly between levels. Try `-pc_mg_galerkin` both to algebraically
15939b92b1d3SBarry Smith  construct a correctly scaled coarse operator or make sure that all the equations are
15949b92b1d3SBarry Smith  scaled in the same way if you want to use rediscretized coarse levels.
15959b92b1d3SBarry Smith
15969b92b1d3SBarry Smith- The matrix is very ill-conditioned. Check the {ref}`condition number <doc_faq_usage_condnum>`.
15979b92b1d3SBarry Smith
15989b92b1d3SBarry Smith  - Try to improve it by choosing the relative scaling of components/boundary conditions.
15999b92b1d3SBarry Smith  - Try `-ksp_diagonal_scale -ksp_diagonal_scale_fix`.
16009b92b1d3SBarry Smith  - Perhaps change the formulation of the problem to produce more friendly algebraic
16019b92b1d3SBarry Smith    equations.
16029b92b1d3SBarry Smith
16039b92b1d3SBarry Smith- Change the units (nondimensionalization), boundary condition scaling, or formulation so
16049b92b1d3SBarry Smith  that the Jacobian is better conditioned. See [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
16059b92b1d3SBarry Smith  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).
16069b92b1d3SBarry Smith
16079b92b1d3SBarry Smith- Classical Gram-Schmidt is becoming unstable, try `-ksp_gmres_modifiedgramschmidt` or
16089b92b1d3SBarry Smith  use a method that orthogonalizes differently, e.g. `-ksp_type gcr`.
16099b92b1d3SBarry Smith
16109b92b1d3SBarry Smith### I get the error message: Actual argument at (1) to assumed-type dummy is of derived type with type-bound or FINAL procedures
16119b92b1d3SBarry Smith
16129b92b1d3SBarry SmithUse the following code-snippet:
16139b92b1d3SBarry Smith
16149b92b1d3SBarry Smith```fortran
16159b92b1d3SBarry Smithmodule context_module
16169b92b1d3SBarry Smith#include petsc/finclude/petsc.h
16179b92b1d3SBarry Smithuse petsc
16189b92b1d3SBarry Smithimplicit none
16199b92b1d3SBarry Smithprivate
16209b92b1d3SBarry Smithtype, public ::  context_type
16219b92b1d3SBarry Smith  private
16229b92b1d3SBarry Smith  PetscInt :: foo
16239b92b1d3SBarry Smithcontains
16249b92b1d3SBarry Smith  procedure, public :: init => context_init
16259b92b1d3SBarry Smithend type context_type
16269b92b1d3SBarry Smithcontains
16279b92b1d3SBarry Smithsubroutine context_init(self, foo)
16289b92b1d3SBarry Smith  class(context_type), intent(in out) :: self
16299b92b1d3SBarry Smith  PetscInt, intent(in) :: foo
16309b92b1d3SBarry Smith  self%foo = foo
16319b92b1d3SBarry Smithend subroutine context_init
16329b92b1d3SBarry Smithend module context_module
16339b92b1d3SBarry Smith
16349b92b1d3SBarry Smith!------------------------------------------------------------------------
16359b92b1d3SBarry Smith
16369b92b1d3SBarry Smithprogram test_snes
16379b92b1d3SBarry Smithuse,intrinsic :: iso_c_binding
16389b92b1d3SBarry Smithuse petsc
16399b92b1d3SBarry Smithuse context_module
16409b92b1d3SBarry Smithimplicit none
16419b92b1d3SBarry Smith
16429b92b1d3SBarry SmithSNES :: snes
16439b92b1d3SBarry Smithtype(context_type),target :: context
16449b92b1d3SBarry Smithtype(c_ptr) :: contextOut
16459b92b1d3SBarry SmithPetscErrorCode :: ierr
16469b92b1d3SBarry Smith
16479b92b1d3SBarry Smithcall PetscInitialize(PETSC_NULL_CHARACTER, ierr)
16489b92b1d3SBarry Smithcall SNESCreate(PETSC_COMM_WORLD, snes, ierr)
16499b92b1d3SBarry Smithcall context%init(1)
16509b92b1d3SBarry Smith
16519b92b1d3SBarry SmithcontextOut = c_loc(context) ! contextOut is a C pointer on context
16529b92b1d3SBarry Smith
16539b92b1d3SBarry Smithcall SNESSetConvergenceTest(snes, convergence, contextOut, PETSC_NULL_FUNCTION, ierr)
16549b92b1d3SBarry Smith
16559b92b1d3SBarry Smithcall SNESDestroy(snes, ierr)
16569b92b1d3SBarry Smithcall PetscFinalize(ierr)
16579b92b1d3SBarry Smith
16589b92b1d3SBarry Smithcontains
16599b92b1d3SBarry Smith
16609b92b1d3SBarry Smithsubroutine convergence(snes, num_iterations, xnorm, pnorm,fnorm, reason, contextIn, ierr)
16619b92b1d3SBarry SmithSNES, intent(in) :: snes
16629b92b1d3SBarry Smith
16639b92b1d3SBarry SmithPetscInt, intent(in) :: num_iterations
16649b92b1d3SBarry SmithPetscReal, intent(in) :: xnorm, pnorm, fnorm
16659b92b1d3SBarry SmithSNESConvergedReason, intent(out) :: reason
16669b92b1d3SBarry Smithtype(c_ptr), intent(in out) :: contextIn
16679b92b1d3SBarry Smithtype(context_type), pointer :: context
16689b92b1d3SBarry SmithPetscErrorCode, intent(out) :: ierr
16699b92b1d3SBarry Smith
16709b92b1d3SBarry Smithcall c_f_pointer(contextIn,context)  ! convert the C pointer to a Fortran pointer to use context as in the main program
16719b92b1d3SBarry Smithreason = 0
16729b92b1d3SBarry Smithierr = 0
16739b92b1d3SBarry Smithend subroutine convergence
16749b92b1d3SBarry Smithend program test_snes
16759b92b1d3SBarry Smith```
16769b92b1d3SBarry Smith
16779b92b1d3SBarry Smith### In C++ I get a crash on `VecDestroy()` (or some other PETSc object) at the end of the program
16789b92b1d3SBarry Smith
16799b92b1d3SBarry SmithThis can happen when the destructor for a C++ class is automatically called at the end of
16809b92b1d3SBarry Smiththe program after `PetscFinalize()`. Use the following code-snippet:
16819b92b1d3SBarry Smith
16829b92b1d3SBarry Smith```
16839b92b1d3SBarry Smithmain()
16849b92b1d3SBarry Smith{
16859b92b1d3SBarry Smith  PetscCall(PetscInitialize());
16869b92b1d3SBarry Smith  {
16879b92b1d3SBarry Smith    your variables
16889b92b1d3SBarry Smith    your code
16899b92b1d3SBarry Smith
16909b92b1d3SBarry Smith    ...   /* all your destructors are called here automatically by C++ so they work correctly */
16919b92b1d3SBarry Smith  }
16929b92b1d3SBarry Smith  PetscCall(PetscFinalize());
16939b92b1d3SBarry Smith  return 0
16949b92b1d3SBarry Smith}
16959b92b1d3SBarry Smith```
16969b92b1d3SBarry Smith
16979b92b1d3SBarry Smith______________________________________________________________________
16989b92b1d3SBarry Smith
16999b92b1d3SBarry Smith## Debugging
17009b92b1d3SBarry Smith
17019b92b1d3SBarry Smith### What does the message hwloc/linux: Ignoring PCI device with non-16bit domain mean?
17029b92b1d3SBarry Smith
17039b92b1d3SBarry SmithThis is printed by the hwloc library that is used by some MPI implementations. It can be ignored.
17049b92b1d3SBarry SmithTo prevent the message from always being printed set the environmental variable `HWLOC_HIDE_ERRORS` to 2.
17059b92b1d3SBarry SmithFor example
17069b92b1d3SBarry Smith
17079b92b1d3SBarry Smith```
17089b92b1d3SBarry Smithexport HWLOC_HIDE_ERRORS=2
17099b92b1d3SBarry Smith```
17109b92b1d3SBarry Smith
17119b92b1d3SBarry Smithwhich can be added to your login profile file such as `~/.bashrc`.
17129b92b1d3SBarry Smith
17139b92b1d3SBarry Smith### How do I turn off PETSc signal handling so I can use the `-C` Option On `xlf`?
17149b92b1d3SBarry Smith
17159b92b1d3SBarry SmithImmediately after calling `PetscInitialize()` call `PetscPopSignalHandler()`.
17169b92b1d3SBarry Smith
17179b92b1d3SBarry SmithSome Fortran compilers including the IBM xlf, xlF etc compilers have a compile option
17189b92b1d3SBarry Smith(`-C` for IBM's) that causes all array access in Fortran to be checked that they are
17199b92b1d3SBarry Smithin-bounds. This is a great feature but does require that the array dimensions be set
17209b92b1d3SBarry Smithexplicitly, not with a \*.
17219b92b1d3SBarry Smith
17229b92b1d3SBarry Smith### How do I debug if `-start_in_debugger` does not work on my machine?
17239b92b1d3SBarry Smith
17249b92b1d3SBarry SmithThe script <https://github.com/Azrael3000/tmpi> can be used to run multiple MPI
17259b92b1d3SBarry Smithranks in the debugger using tmux.
17269b92b1d3SBarry Smith
17279b92b1d3SBarry SmithOn newer macOS machines - one has to be in admin group to be able to use debugger.
17289b92b1d3SBarry Smith
17299b92b1d3SBarry SmithOn newer Ubuntu linux machines - one has to disable `ptrace_scope` with
17309b92b1d3SBarry Smith
17319b92b1d3SBarry Smith```console
17329b92b1d3SBarry Smith$ sudo echo 0 > /proc/sys/kernel/yama/ptrace_scope
17339b92b1d3SBarry Smith```
17349b92b1d3SBarry Smith
17359b92b1d3SBarry Smithto get start in debugger working.
17369b92b1d3SBarry Smith
17379b92b1d3SBarry SmithIf `-start_in_debugger` does not work on your OS, for a uniprocessor job, just
17389b92b1d3SBarry Smithtry the debugger directly, for example: `gdb ex1`. You can also use [TotalView](https://totalview.io/products/totalview) which is a good graphical parallel debugger.
17399b92b1d3SBarry Smith
17409b92b1d3SBarry Smith### How do I see where my code is hanging?
17419b92b1d3SBarry Smith
17429b92b1d3SBarry SmithYou can use the `-start_in_debugger` option to start all processes in the debugger (each
17439b92b1d3SBarry Smithwill come up in its own xterm) or run in [TotalView](https://totalview.io/products/totalview). Then use `cont` (for continue) in each
17449b92b1d3SBarry Smithxterm. Once you are sure that the program is hanging, hit control-c in each xterm and then
17459b92b1d3SBarry Smithuse 'where' to print a stack trace for each process.
17469b92b1d3SBarry Smith
17479b92b1d3SBarry Smith### How can I inspect PETSc vector and matrix values when in the debugger?
17489b92b1d3SBarry Smith
17499b92b1d3SBarry SmithI will illustrate this with `gdb`, but it should be similar on other debuggers. You can
17509b92b1d3SBarry Smithlook at local `Vec` values directly by obtaining the array. For a `Vec` v, we can
17519b92b1d3SBarry Smithprint all local values using:
17529b92b1d3SBarry Smith
17539b92b1d3SBarry Smith```console
17549b92b1d3SBarry Smith(gdb) p ((Vec_Seq*) v->data)->array[0]@v->map.n
17559b92b1d3SBarry Smith```
17569b92b1d3SBarry Smith
17579b92b1d3SBarry SmithHowever, this becomes much more complicated for a matrix. Therefore, it is advisable to use the default viewer to look at the object. For a `Vec` v and a `Mat` m, this would be:
17589b92b1d3SBarry Smith
17599b92b1d3SBarry Smith```console
17609b92b1d3SBarry Smith(gdb) call VecView(v, 0)
17619b92b1d3SBarry Smith(gdb) call MatView(m, 0)
17629b92b1d3SBarry Smith```
17639b92b1d3SBarry Smith
17649b92b1d3SBarry Smithor with a communicator other than `MPI_COMM_WORLD`:
17659b92b1d3SBarry Smith
17669b92b1d3SBarry Smith```console
17679b92b1d3SBarry Smith(gdb) call MatView(m, PETSC_VIEWER_STDOUT_(m->comm))
17689b92b1d3SBarry Smith```
17699b92b1d3SBarry Smith
17709b92b1d3SBarry SmithTotalview 8.8.0+ has a new feature that allows libraries to provide their own code to
17719b92b1d3SBarry Smithdisplay objects in the debugger. Thus in theory each PETSc object, `Vec`, `Mat` etc
17729b92b1d3SBarry Smithcould have custom code to print values in the object. We have only done this for the most
17739b92b1d3SBarry Smithelementary display of `Vec` and `Mat`. See the routine `TV_display_type()` in
17749b92b1d3SBarry Smith`src/vec/vec/interface/vector.c` for an example of how these may be written. Contact us
17759b92b1d3SBarry Smithif you would like to add more.
17769b92b1d3SBarry Smith
17779b92b1d3SBarry Smith### How can I find the cause of floating point exceptions like not-a-number (NaN) or infinity?
17789b92b1d3SBarry Smith
17799b92b1d3SBarry SmithThe best way to locate floating point exceptions is to use a debugger. On supported
17809b92b1d3SBarry Smitharchitectures (including Linux and glibc-based systems), just run in a debugger and pass
17819b92b1d3SBarry Smith`-fp_trap` to the PETSc application. This will activate signaling exceptions and the
17829b92b1d3SBarry Smithdebugger will break on the line that first divides by zero or otherwise generates an
17839b92b1d3SBarry Smithexceptions.
17849b92b1d3SBarry Smith
17859b92b1d3SBarry SmithWithout a debugger, running with `-fp_trap` in debug mode will only identify the
17869b92b1d3SBarry Smithfunction in which the error occurred, but not the line or the type of exception. If
17879b92b1d3SBarry Smith`-fp_trap` is not supported on your architecture, consult the documentation for your
17889b92b1d3SBarry Smithdebugger since there is likely a way to have it catch exceptions.
17899b92b1d3SBarry Smith
17909b92b1d3SBarry Smith(error_libimf)=
17919b92b1d3SBarry Smith
17929b92b1d3SBarry Smith### Error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory
17939b92b1d3SBarry Smith
17949b92b1d3SBarry SmithThe Intel compilers use shared libraries (like libimf) that cannot be found, by default, at run
17959b92b1d3SBarry Smithtime. When using the Intel compilers (and running the resulting code) you must make sure
17969b92b1d3SBarry Smiththat the proper Intel initialization scripts are run. This is usually done by adding some
17979b92b1d3SBarry Smithcode into your `.cshrc`, `.bashrc`, `.profile` etc file. Sometimes on batch file
17989b92b1d3SBarry Smithsystems that do now access your initialization files (like .cshrc) you must include the
17999b92b1d3SBarry Smithinitialization calls in your batch file submission.
18009b92b1d3SBarry Smith
18019b92b1d3SBarry SmithFor example, on my Mac using `csh` I have the following in my `.cshrc` file:
18029b92b1d3SBarry Smith
18039b92b1d3SBarry Smith```csh
18049b92b1d3SBarry Smithsource /opt/intel/cc/10.1.012/bin/iccvars.csh
18059b92b1d3SBarry Smithsource /opt/intel/fc/10.1.012/bin/ifortvars.csh
18069b92b1d3SBarry Smithsource /opt/intel/idb/10.1.012/bin/idbvars.csh
18079b92b1d3SBarry Smith```
18089b92b1d3SBarry Smith
18099b92b1d3SBarry SmithAnd in my `.profile` I have
18109b92b1d3SBarry Smith
18119b92b1d3SBarry Smith```csh
18129b92b1d3SBarry Smithsource /opt/intel/cc/10.1.012/bin/iccvars.sh
18139b92b1d3SBarry Smithsource /opt/intel/fc/10.1.012/bin/ifortvars.sh
18149b92b1d3SBarry Smithsource /opt/intel/idb/10.1.012/bin/idbvars.sh
18159b92b1d3SBarry Smith```
18169b92b1d3SBarry Smith
18179b92b1d3SBarry Smith(object_type_not_set)=
18189b92b1d3SBarry Smith
18199b92b1d3SBarry Smith### What does "Object Type Not Set: Argument # N" Mean?
18209b92b1d3SBarry Smith
18219b92b1d3SBarry SmithMany operations on PETSc objects require that the specific type of the object be set before the operations is performed. You must call `XXXSetType()` or `XXXSetFromOptions()` before you make the offending call. For example
18229b92b1d3SBarry Smith
18239b92b1d3SBarry Smith```
18249b92b1d3SBarry SmithMat A;
18259b92b1d3SBarry Smith
18269b92b1d3SBarry SmithPetscCall(MatCreate(PETSC_COMM_WORLD, &A));
18279b92b1d3SBarry SmithPetscCall(MatSetValues(A,....));
18289b92b1d3SBarry Smith```
18299b92b1d3SBarry Smith
18309b92b1d3SBarry Smithwill not work. You must add `MatSetType()` or `MatSetFromOptions()` before the call to `MatSetValues()`. I.e.
18319b92b1d3SBarry Smith
18329b92b1d3SBarry Smith```
18339b92b1d3SBarry SmithMat A;
18349b92b1d3SBarry Smith
18359b92b1d3SBarry SmithPetscCall(MatCreate(PETSC_COMM_WORLD, &A));
18369b92b1d3SBarry Smith
18379b92b1d3SBarry SmithPetscCall(MatSetType(A, MATAIJ));
18389b92b1d3SBarry Smith/* Will override MatSetType() */
18399b92b1d3SBarry SmithPetscCall(MatSetFromOptions());
18409b92b1d3SBarry Smith
18419b92b1d3SBarry SmithPetscCall(MatSetValues(A,....));
18429b92b1d3SBarry Smith```
18439b92b1d3SBarry Smith
18449b92b1d3SBarry Smith(split_ownership)=
18459b92b1d3SBarry Smith
18469b92b1d3SBarry Smith### What does error detected in `PetscSplitOwnership()` about "sum of local lengths ...": mean?
18479b92b1d3SBarry Smith
18489b92b1d3SBarry SmithIn a previous call to `VecSetSizes()`, `MatSetSizes()`, `VecCreateXXX()` or
18499b92b1d3SBarry Smith`MatCreateXXX()` you passed in local and global sizes that do not make sense for the
18509b92b1d3SBarry Smithcorrect number of processors. For example if you pass in a local size of 2 and a global
18519b92b1d3SBarry Smithsize of 100 and run on two processors, this cannot work since the sum of the local sizes
18529b92b1d3SBarry Smithis 4, not 100.
18539b92b1d3SBarry Smith
18549b92b1d3SBarry Smith(valgrind)=
18559b92b1d3SBarry Smith
1856*74df5e01SJunchao Zhang### What does "corrupt argument" or "caught signal" Or "SEGV" Or "segmentation violation" Or "bus error" mean? Can I use Valgrind or CUDA Compute Sanitizer to debug memory corruption issues?
18579b92b1d3SBarry Smith
18589b92b1d3SBarry SmithSometimes it can mean an argument to a function is invalid. In Fortran this may be caused
18599b92b1d3SBarry Smithby forgetting to list an argument in the call, especially the final `PetscErrorCode`.
18609b92b1d3SBarry Smith
18619b92b1d3SBarry SmithOtherwise it is usually caused by memory corruption; that is somewhere the code is writing
18629b92b1d3SBarry Smithout of array bounds. To track this down rerun the debug version of the code with the
18639b92b1d3SBarry Smithoption `-malloc_debug`. Occasionally the code may crash only with the optimized version,
18649b92b1d3SBarry Smithin that case run the optimized version with `-malloc_debug`. If you determine the
18659b92b1d3SBarry Smithproblem is from memory corruption you can put the macro CHKMEMQ in the code near the crash
18669b92b1d3SBarry Smithto determine exactly what line is causing the problem.
18679b92b1d3SBarry Smith
1868*74df5e01SJunchao ZhangIf `-malloc_debug` does not help: on NVIDIA CUDA systems you can use <https://docs.nvidia.com/compute-sanitizer/ComputeSanitizer/index.html>,
1869*74df5e01SJunchao Zhangfor example, `compute-sanitizer --tool memcheck [sanitizer_options] app_name [app_options]`.
18709b92b1d3SBarry Smith
18719b92b1d3SBarry SmithIf `-malloc_debug` does not help: on GNU/Linux (not macOS machines) - you can
18729b92b1d3SBarry Smithuse [valgrind](http://valgrind.org). Follow the below instructions:
18739b92b1d3SBarry Smith
18749b92b1d3SBarry Smith1. `configure` PETSc with `--download-mpich --with-debugging` (you can use other MPI implementations but most produce spurious Valgrind messages)
18759b92b1d3SBarry Smith
18769b92b1d3SBarry Smith2. Compile your application code with this build of PETSc.
18779b92b1d3SBarry Smith
18789b92b1d3SBarry Smith3. Run with Valgrind.
18799b92b1d3SBarry Smith
18809b92b1d3SBarry Smith   ```console
18819b92b1d3SBarry Smith   $ $PETSC_DIR/lib/petsc/bin/petscmpiexec -valgrind -n NPROC PETSCPROGRAMNAME PROGRAMOPTIONS
18829b92b1d3SBarry Smith   ```
18839b92b1d3SBarry Smith
18849b92b1d3SBarry Smith   or
18859b92b1d3SBarry Smith
18869b92b1d3SBarry Smith   ```console
18879b92b1d3SBarry Smith   $ mpiexec -n NPROC valgrind --tool=memcheck -q --num-callers=20 \
18889b92b1d3SBarry Smith   --suppressions=$PETSC_DIR/share/petsc/suppressions/valgrind \
18899b92b1d3SBarry Smith   --log-file=valgrind.log.%p PETSCPROGRAMNAME -malloc off PROGRAMOPTIONS
18909b92b1d3SBarry Smith   ```
18919b92b1d3SBarry Smith
18929b92b1d3SBarry Smith:::{note}
18939b92b1d3SBarry Smith- option `--with-debugging` enables valgrind to give stack trace with additional
18949b92b1d3SBarry Smith  source-file:line-number info.
18959b92b1d3SBarry Smith- option `--download-mpich` is valgrind clean, other MPI builds are not valgrind clean.
18969b92b1d3SBarry Smith- when `--download-mpich` is used - mpiexec will be in `$PETSC_ARCH/bin`
18979b92b1d3SBarry Smith- `--log-file=valgrind.log.%p` option tells valgrind to store the output from each
18989b92b1d3SBarry Smith  process in a different file \[as %p i.e PID, is different for each MPI process.
18999b92b1d3SBarry Smith- `memcheck` will not find certain array access that violate static array
19009b92b1d3SBarry Smith  declarations so if memcheck runs clean you can try the `--tool=exp-ptrcheck`
19019b92b1d3SBarry Smith  instead.
19029b92b1d3SBarry Smith:::
19039b92b1d3SBarry Smith
19049b92b1d3SBarry SmithYou might also consider using <http://drmemory.org> which has support for GNU/Linux, Apple
19059b92b1d3SBarry SmithMac OS and Microsoft Windows machines. (Note we haven't tried this ourselves).
19069b92b1d3SBarry Smith
19079b92b1d3SBarry Smith(zeropivot)=
19089b92b1d3SBarry Smith
19099b92b1d3SBarry Smith### What does "detected zero pivot in LU factorization" mean?
19109b92b1d3SBarry Smith
19119b92b1d3SBarry SmithA zero pivot in LU, ILU, Cholesky, or ICC sparse factorization does not always mean that
19129b92b1d3SBarry Smiththe matrix is singular. You can use
19139b92b1d3SBarry Smith
19149b92b1d3SBarry Smith```text
19159b92b1d3SBarry Smith-pc_factor_shift_type nonzero -pc_factor_shift_amount [amount]
19169b92b1d3SBarry Smith```
19179b92b1d3SBarry Smith
19189b92b1d3SBarry Smithor
19199b92b1d3SBarry Smith
19209b92b1d3SBarry Smith```text
19219b92b1d3SBarry Smith-pc_factor_shift_type positive_definite -[level]_pc_factor_shift_type nonzero
19229b92b1d3SBarry Smith -pc_factor_shift_amount [amount]
19239b92b1d3SBarry Smith```
19249b92b1d3SBarry Smith
19259b92b1d3SBarry Smithor
19269b92b1d3SBarry Smith
19279b92b1d3SBarry Smith```text
19289b92b1d3SBarry Smith-[level]_pc_factor_shift_type positive_definite
19299b92b1d3SBarry Smith```
19309b92b1d3SBarry Smith
19319b92b1d3SBarry Smithto prevent the zero pivot. [level] is "sub" when lu, ilu, cholesky, or icc are employed in
19329b92b1d3SBarry Smitheach individual block of the bjacobi or ASM preconditioner. [level] is "mg_levels" or
19339b92b1d3SBarry Smith"mg_coarse" when lu, ilu, cholesky, or icc are used inside multigrid smoothers or to the
19349b92b1d3SBarry Smithcoarse grid solver. See `PCFactorSetShiftType()`, `PCFactorSetShiftAmount()`.
19359b92b1d3SBarry Smith
19369b92b1d3SBarry SmithThis error can also happen if your matrix is singular, see `MatSetNullSpace()` for how
19379b92b1d3SBarry Smithto handle this. If this error occurs in the zeroth row of the matrix, it is likely you
19389b92b1d3SBarry Smithhave an error in the code that generates the matrix.
19399b92b1d3SBarry Smith
19409b92b1d3SBarry Smith### You create draw windows or `PETSCVIEWERDRAW` windows or use options `-ksp_monitor draw::draw_lg` or `-snes_monitor draw::draw_lg` and the program seems to run OK but windows never open
19419b92b1d3SBarry Smith
19429b92b1d3SBarry SmithThe libraries were compiled without support for X windows. Make sure that `configure`
19439b92b1d3SBarry Smithwas run with the option `--with-x`.
19449b92b1d3SBarry Smith
### The program seems to use more and more memory as it runs, even though you don't think you are allocating more memory

Some of the following may be occurring:

- You are creating new PETSc objects but never freeing them.
- There is a memory leak in PETSc or your code.
- Something much more subtle (if you are using Fortran): when you declare a large array
  in Fortran, the operating system does not allocate all the memory pages for that array
  until you start using the different locations in the array. Thus, if at each step your
  code starts using later values in the array, your virtual memory usage will "continue"
  to increase as measured by `ps` or `top`.
- You are running with the `-log`, `-log_mpe`, or `-log_all` option. With these
  options, a great deal of logging information is stored in memory until the conclusion of
  the run.
- You are linking with the MPI profiling libraries; these cause logging of all MPI
  activities. Another symptom is that at the conclusion of the run it may print some
  message about writing log files.

The following may help:

- Run with the `-malloc_debug` and `-malloc_view` options. Or use `PetscMallocDump()`
  and `PetscMallocView()` sprinkled about your code to track memory that is allocated
  and not later freed. Use `PetscMallocGetCurrentUsage()` and
  `PetscMemoryGetCurrentUsage()` to monitor memory allocated, and
  `PetscMallocGetMaximumUsage()` and `PetscMemoryGetMaximumUsage()` for the total memory
  used as the code progresses.
- This is just the way Unix works and is harmless.
- Do not use the `-log`, `-log_mpe`, or `-log_all` options, or use
  `PetscLogEventDeactivate()` or `PetscLogEventDeactivateClass()` to turn off logging of
  specific events.
- Make sure you do not link with the MPI profiling libraries.
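For example, to report any PETSc allocations that were never freed at the end of a run
(the executable name is a placeholder):

```console
$ ./ex1 -malloc_debug -malloc_view
```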
19769b92b1d3SBarry Smith
### When calling `MatPartitioningApply()` you get a message "Error! Key 16615 Not Found"

The adjacency graph of the matrix you are using is not symmetric. The partitioners
require matrices with a symmetric nonzero structure.
19819b92b1d3SBarry Smith
### With GMRES at restart the second residual norm printed does not match the first

I.e.

```text
26 KSP Residual norm 3.421544615851e-04
27 KSP Residual norm 2.973675659493e-04
28 KSP Residual norm 2.588642948270e-04
29 KSP Residual norm 2.268190747349e-04
30 KSP Residual norm 1.977245964368e-04
30 KSP Residual norm 1.994426291979e-04 <----- At restart the residual norm is printed a second time
```

This is actually not surprising! At each iteration GMRES computes the norm of the
residual via a recurrence relation involving the norms of the residuals at the previous
iterations and quantities computed at the current iteration. It does not compute it
directly as $|| b - A x^{n} ||$.

Sometimes, especially with an ill-conditioned matrix or when the matrix-vector product is
computed via differencing, the residual norms computed by GMRES start to "drift" from the
correct values. At the restart the residual norm is computed directly, hence the "strange
stuff": the difference printed. The drift, if it remains small, is harmless (it does not
affect the accuracy of the solution that GMRES computes).

If you use a more powerful preconditioner the drift will often be smaller and less
noticeable. Or, if you are running matrix-free, you may need to tune the matrix-free
parameters.
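To observe the drift directly, you can print both the (preconditioned) recurrence-based
norm and the true residual norm $|| b - A x^{n} ||$ at every iteration (the executable
name is a placeholder):

```console
$ ./ex1 -ksp_monitor_true_residual
```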
20099b92b1d3SBarry Smith
### Why do some Krylov methods seem to print two residual norms per iteration?

I.e.

```text
1198 KSP Residual norm 1.366052062216e-04
1198 KSP Residual norm 1.931875025549e-04
1199 KSP Residual norm 1.366026406067e-04
1199 KSP Residual norm 1.931819426344e-04
```

Some Krylov methods, for example `KSPTFQMR`, actually have a "sub-iteration" of size 2
inside the loop. Each of the two substeps has its own matrix-vector product and
application of the preconditioner, and each updates the residual approximations. This is
why you get this "funny" output where it looks like there are two residual norms per
iteration. You can also think of it as twice as many iterations.
20269b92b1d3SBarry Smith
### Unable to locate PETSc dynamic library `libpetsc`

When using DYNAMIC libraries, the libraries cannot be moved after they are
installed. This can also happen on clusters where the paths on the (run) compute nodes
differ from those on the (compile) front-end node. In that case, **do not use dynamic and
shared libraries**: run `configure` with
`--with-shared-libraries=0 --with-dynamic-loading=0`.

:::{important}
The `--with-dynamic-loading` option was removed in PETSc v3.5
:::
20389b92b1d3SBarry Smith
### How do I determine what update to PETSc broke my code?

If at some point (in the PETSc code history) you had a working code, but the latest PETSc
code broke it, it is possible to determine the PETSc code change that caused this
behavior. This is achieved by:

- Using Git to access the PETSc sources
- Knowing the Git commit for the known working version of PETSc
- Knowing the Git commit for the known broken version of PETSc
- Using the [bisect](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html)
  functionality of Git

Git bisect can be done as follows:

1. Get the PETSc development (`main` branch in Git) sources

   ```console
   $ git clone https://gitlab.com/petsc/petsc.git
   ```

2. Find the good and bad markers to start the bisection process. This can be done either
   by checking `git log` or `gitk` or <https://gitlab.com/petsc/petsc> or the web
   history of petsc-release clones. Let's say the known bad commit is 21af4baa815c and
   the known good commit is 5ae5ab319844.

3. Start the bisection process with these known revisions. Build PETSc, and test your code
   to confirm known good/bad behavior:

   ```console
   $ git bisect start 21af4baa815c 5ae5ab319844
   ```

   Build/test; perhaps discover that this new state is bad

   ```console
   $ git bisect bad
   ```

   Build/test; perhaps discover that this state is good

   ```console
   $ git bisect good
   ```

   Keep bisecting, building PETSc, and testing your code with it, determining at each
   step whether the code works or not. After something like 5-15 iterations, `git
   bisect` will pinpoint the exact code change that resulted in the difference in
   application behavior.
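If the build-and-test cycle can be scripted to exit with status 0 on good and nonzero on
bad, Git can drive the whole bisection automatically. A sketch, assuming a hypothetical
script `test.sh` that rebuilds PETSc and runs your application:

```console
$ git bisect start 21af4baa815c 5ae5ab319844
$ git bisect run ./test.sh
```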
20879b92b1d3SBarry Smith
:::{tip}
See [git-bisect(1)](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html) and the
[debugging section of the Git Book](https://git-scm.com/book/en/Git-Tools-Debugging-with-Git) for more debugging tips.
:::
20929b92b1d3SBarry Smith
### How to fix the error "PMIX Error: error in file gds_ds12_lock_pthread.c"?

This seems to be an error when using Open MPI together with OpenBLAS built with threads
(or perhaps other packages that use threads). Setting

```console
$ export PMIX_MCA_gds=hash
```

before launching should resolve the problem.
21039b92b1d3SBarry Smith
______________________________________________________________________

(doc_faq_sharedlibs)=

## Shared Libraries

### Can I install PETSc libraries as shared libraries?

Yes. Use

```console
$ ./configure --with-shared-libraries
```
21179b92b1d3SBarry Smith
### Why should I use shared libraries?

When you link to shared libraries, the function symbols from the shared libraries are not
copied into the executable, so the executable is considerably smaller than when using
static libraries. This helps in a couple of ways:

- It saves disk space when more than one executable is created.
- It improves link time immensely, because the linker has to write a much smaller file
  (the executable) to disk.
21279b92b1d3SBarry Smith
### How do I link to the PETSc shared libraries?

By default, the compiler should pick up the shared libraries instead of the static
ones. Nothing special should be done for this.
21329b92b1d3SBarry Smith
### What if I want to link to the regular `.a` library files?

You must run `configure` with the option `--with-shared-libraries=0` (you can use a
different `$PETSC_ARCH` for this build so you can easily switch between the two).
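For example, the two builds can live side by side, each with its own `$PETSC_ARCH` (the
arch names here are arbitrary):

```console
$ ./configure PETSC_ARCH=arch-shared --with-shared-libraries=1
$ ./configure PETSC_ARCH=arch-static --with-shared-libraries=0
```

Selecting a build at compile time is then just a matter of setting `PETSC_ARCH`.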
21379b92b1d3SBarry Smith
### What do I do if I want to move my executable to a different machine?

You would also need access to the shared libraries on the new machine. The alternative is
to build the executable without shared libraries, by first deleting the shared libraries
and then creating the executable.
21439b92b1d3SBarry Smith
```{bibliography} /petsc.bib
:filter: docname in docnames
```