xref: /petsc/doc/faq/index.md (revision ede9db9363e1fdaaa09befd664c8164883ccce80)
(doc_faq)=

# FAQ

```{contents} Table Of Contents
:backlinks: top
:local: true
```

______________________________________________________________________

## General

### How can I subscribe to the PETSc mailing lists?

See the mailing list {ref}`documentation <doc_mail>`.

### Any useful books on numerical computing?

[Bueler, PETSc for Partial Differential Equations: Numerical Solutions in C and Python](https://my.siam.org/Store/Product/viewproduct/?ProductId=32850137)

[Oliveira and Stewart, Writing Scientific Software: A Guide to Good Style](https://www.cambridge.org/core/books/writing-scientific-software/23206704175AF868E43FE3FB399C2F53)

(doc_faq_general_parallel)=

### What kind of parallel computers or clusters are needed to use PETSc? Or why do I get little speedup?

:::{important}
PETSc can be used with any kind of parallel system that supports MPI, BUT for any decent
performance one needs:

- A fast, **low-latency** interconnect; any ethernet (even 10 GigE) simply cannot provide
  the needed performance.
- High per-core **memory** performance. Each core needs to
  have its **own** memory bandwidth of at least 2 gigabytes/second. Most modern
  computers are not bottlenecked by how fast they can perform
  calculations; rather, they are usually restricted by how quickly they can get their
  data.
:::

To obtain good performance it is important that you know your machine! That is: how many
compute nodes (generally, how many motherboards), how many memory sockets per node, how
many cores per memory socket, and how much memory bandwidth each has.
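On Linux, a quick way to inspect this layout is with standard command-line tools (the
exact fields printed by `lscpu` vary by distribution):

```console
$ nproc                             # logical cores visible to the OS
$ lscpu | grep -E 'Socket|NUMA'     # sockets and NUMA nodes (Linux only)
```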

If you do not know this, and you can run MPI programs with `mpiexec` (that is, you don't
have a batch system), run the following from `$PETSC_DIR`:

```console
$ make streams [NPMAX=maximum_number_of_mpi_processes_you_plan_to_use]
```

This will provide a summary of the bandwidth achieved with different numbers of MPI
processes and the potential speedups. See {any}`ch_streams` for a detailed discussion.

If you have a batch system:

```console
$ cd $PETSC_DIR/src/benchmarks/streams
$ make MPIVersion
```

Submit `MPIVersion` to the batch system a number of times with 1, 2, 3, etc. MPI
processes, collecting all of the output from the runs into the single file
`scaling.log`. Copy `scaling.log` into the `src/benchmarks/streams` directory, then run:

```console
$ ./process.py createfile ; ./process.py
```

Even if you have enough memory bandwidth, performance can degrade if the OS switches
processes between cores. Smart process-to-core/socket binding (this just means locking a
process to a particular core or memory socket) may help you. For example, consider using
fewer processes than cores and binding processes to separate sockets so that each process
uses a different memory bus:

- [MPICH binding with the Hydra process manager](https://github.com/pmodels/mpich/blob/main/doc/wiki/how_to/Using_the_Hydra_Process_Manager.md#process-core-binding)

  ```console
  $ mpiexec.hydra -n 4 --binding cpu:sockets
  ```

- [Open MPI binding](https://www.open-mpi.org/faq/?category=tuning#using-paffinity)

  ```console
  $ mpiexec -n 4 --map-by socket --bind-to socket --report-bindings
  ```

- `taskset`, part of the [util-linux](https://github.com/karelzak/util-linux) package

  Check `man taskset` for details. Make sure to set affinity for **your** program,
  **not** for the `mpiexec` program.

- `numactl`

  In addition to task affinity, this tool also allows changing the default memory-affinity
  policy. On Linux, the default policy is to attempt to find memory on the same memory bus
  that serves the core that a thread is running on when the memory is faulted
  (not when `malloc()` is called). If local memory is not available, it is found
  elsewhere, possibly leading to serious memory imbalances.

  The option `--localalloc` allocates memory on the local NUMA node, similar to the
  `numa_alloc_local()` function in the `libnuma` library. The option
  `--cpunodebind=nodes` binds the process to a given NUMA node (note that a NUMA node can be
  larger or smaller than a CPU (socket); a NUMA node usually has multiple cores).

  The option `--physcpubind=cpus` binds the process to a given processor core (numbered
  according to `/proc/cpuinfo`, therefore including logical cores if Hyper-Threading is
  enabled).

  With Open MPI, you can use knowledge of the NUMA hierarchy and core numbering on your
  machine to calculate the correct NUMA node or processor number given the environment
  variable `OMPI_COMM_WORLD_LOCAL_RANK`. In most cases, it is easier to let `mpiexec` or
  a resource manager set the affinities.
______________________________________________________________________

### What kind of license is PETSc released under?

See the licensing {ref}`documentation <doc_license>`.

### Why is PETSc written in C, instead of Fortran or C++?

When this decision was made, in the early 1990s, C enabled us to build data structures
for storing sparse matrices, solver information,
etc. in ways that Fortran simply did not allow. ANSI C was a complete standard that all
modern C compilers supported, and the language was identical on all machines. C++ was still
evolving, and compilers on different machines were not identical. Using C function pointers
to provide data encapsulation and polymorphism allowed us to get many of the advantages of
C++ without using such a large and complicated language. It would have been natural and
reasonable to have coded PETSc in C++; we opted to use C instead.

### Does all the PETSc error checking and logging reduce PETSc's efficiency?

No.

(doc_faq_maintenance_strats)=

### How does such a small group of people manage to write and maintain such a large and marvelous package as PETSc?

1. **We work very efficiently**.

   - We use powerful editors and programming environments.
   - Our manual pages are generated automatically from formatted comments in the code,
     thus alleviating the need for creating and maintaining separate manual pages.
   - We employ continuous integration testing of the entire PETSc library on many different
     machine architectures. This process **significantly** protects (no bug-catching
     process is perfect) against inadvertently introducing bugs with new additions. Every
     new feature **must** pass our suite of thousands of tests as well as formal code
     review before it may be included.

2. **We are very careful in our design (and are constantly revising our design)**

   - PETSc as a package should be easy to use, write, and maintain. Our mantra is to write
     code like everyone is using it.

3. **We are willing to do the grunt work**

   - PETSc is regularly checked to make sure that all code conforms to our interface
     design. We will never keep a bad design decision simply because changing it would
     require a lot of editing; we do a lot of editing.

4. **We constantly seek out and experiment with new design ideas**

   - We retain the useful ones and discard the rest. All of these decisions are based not
     just on performance, but also on **practicality**.

5. **Function and variable names must adhere to strict guidelines**

   - Even the rules about capitalization are designed to make it easy to figure out the
     name of a particular object or routine. Our memories are terrible, so careful,
     consistent naming puts less stress on our limited human RAM.

6. **The PETSc directory tree is designed to make it easy to move throughout the
   entire package**

7. **We have a rich, robust, and fast bug reporting system**

   - <mailto:petsc-maint@mcs.anl.gov> is always checked, and we pride ourselves on responding
     quickly and accurately. Email is very lightweight, and our bug reporting system retains
     an archive of all reported problems and fixes, so it is easy to re-find fixes to
     previously discovered problems.

8. **We contain the complexity of PETSc by using powerful object-oriented programming
   techniques**

   - Data encapsulation serves to abstract complex data formats or movement to a
     human-readable format. This is why your program cannot, for example, look directly
     at what is inside the object `Mat`.
   - Polymorphism makes changing program behavior as easy as possible, and further
     abstracts the *intent* of your program from what is *written* in code. You call
     `MatMult()` regardless of whether your matrix is dense, sparse, parallel or
     sequential; you don't call a different routine for each format.

9. **We try to provide the functionality requested by our users**

### For complex numbers will I get better performance with C++?

To use PETSc with complex numbers you may use the following `configure` options:
`--with-scalar-type=complex` and either `--with-clanguage=c++` or (the default)
`--with-clanguage=c`. In our experience they deliver very similar performance
(speed), but if you are concerned, try both and see which is faster.

### How come when I run the same program on the same number of processes I get a "different" answer?

Inner products and norms in PETSc are computed using the `MPI_Allreduce()` command. In
different runs the values may arrive at a given process (via MPI) in a
different order, so the order in which some floating point arithmetic operations are
performed will be different. Since floating point arithmetic is not
associative, the computed quantity may be slightly different.

Over a run the many slight differences in the inner products and norms will affect all the
computed results. It is important to realize that none of the computed answers are any
less right or wrong (in fact the sequential computation is no more right than the parallel
ones). All answers are equal, but some are more equal than others.
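The non-associativity is easy to observe on any machine with IEEE double precision
arithmetic; Python is used here only for brevity, and the same holds in C or Fortran:

```console
$ python3 -c "print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))"
False
```

The two groupings round differently, so summing the same values in a different order, as
`MPI_Allreduce()` may do, gives a slightly different result.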

The discussion above assumes that the exact same algorithm is being used on the different
numbers of processes. When the algorithm differs with the number of processes
(almost all preconditioner algorithms except Jacobi are different for different numbers of
processes) then one expects to see (and does see) a greater difference in results for
different numbers of processes. In some cases (for example the block Jacobi preconditioner) it
may be that the algorithm works for some numbers of processes and does not work for others.

### How come when I run the same linear solver on a different number of processes it takes a different number of iterations?

The convergence of many of the preconditioners in PETSc, including the default parallel
preconditioner block Jacobi, depends on the number of processes. The more processes, the
(slightly) slower the convergence. This is the nature of iterative solvers: more
parallelism means more "older" information is used in the solution process, hence
slower convergence.

(doc_faq_gpuhowto)=

### Can PETSc use GPUs to speed up computations?

:::{seealso}
See the GPU development {ref}`roadmap <doc_gpu_roadmap>` for the latest information
regarding the state of PETSc GPU integration.

See the GPU install {ref}`documentation <doc_config_accel>` for up-to-date information on
installing PETSc to use GPUs.
:::

Quick summary of usage with CUDA:

- The `VecType` `VECSEQCUDA`, `VECMPICUDA`, or `VECCUDA` may be used with
  `VecSetType()` or `-vec_type seqcuda`, `mpicuda`, or `cuda` when
  `VecSetFromOptions()` is used.
- The `MatType` `MATSEQAIJCUSPARSE`, `MATMPIAIJCUSPARSE`, or `MATAIJCUSPARSE`
  may be used with `MatSetType()` or `-mat_type seqaijcusparse`, `mpiaijcusparse`, or
  `aijcusparse` when `MatSetFromOptions()` is used.
- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type
  cuda` and `-dm_mat_type aijcusparse`.

Quick summary of usage with OpenCL (provided by the ViennaCL library):

- The `VecType` `VECSEQVIENNACL`, `VECMPIVIENNACL`, or `VECVIENNACL` may be used
  with `VecSetType()` or `-vec_type seqviennacl`, `mpiviennacl`, or `viennacl`
  when `VecSetFromOptions()` is used.
- The `MatType` `MATSEQAIJVIENNACL`, `MATMPIAIJVIENNACL`, or `MATAIJVIENNACL`
  may be used with `MatSetType()` or `-mat_type seqaijviennacl`, `mpiaijviennacl`, or
  `aijviennacl` when `MatSetFromOptions()` is used.
- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type
  viennacl` and `-dm_mat_type aijviennacl`.

General hints:

- Since debugging on GPUs is difficult, it is useful to develop your code with the default
  vectors and matrices and then switch to the GPU via the command line options for
  production runs.
- All of the Krylov methods except `KSPIBCGS` run on the GPU.
- Parts of most preconditioners run directly on the GPU. After setup, `PCGAMG` runs
  fully on GPUs, without any memory copies between the CPU and GPU.

Some GPU systems (for example many laptops) only run with single precision; for these,
PETSc must be built with the `configure` option `--with-precision=single`.

(doc_faq_extendedprecision)=

### Can I run PETSc with extended precision?

Yes, with gcc and gfortran. `configure` PETSc using the
options `--with-precision=__float128` and `--download-f2cblaslapack`.

:::{admonition} Warning
:class: yellow

External packages are not guaranteed to work in this mode!
:::

### Why doesn't PETSc use QD to implement support for extended precision?

We tried really hard but could not. The problem is that the QD C++ classes, though they
try to mimic the built-in `double` data type, are not native types and cannot
"just be used" in a general piece of numerical source code. Rather, the code has to be
rewritten to live within the limitations of the QD classes. However, PETSc can be built to use
quad precision, as detailed {ref}`here <doc_faq_extendedprecision>`.

### How do I cite PETSc?

Use {any}`these citations <doc_index_citing_petsc>`.

______________________________________________________________________

## Installation

### How do I begin using PETSc if the software has already been completely built and installed by someone else?

Assuming that the PETSc libraries have been successfully built for a particular
architecture and level of optimization, a new user must merely:

1. Set `PETSC_DIR` to the full path of the PETSc home
   directory. This will be the location of the `configure` script, and is usually called
   "petsc" or some variation of that (for example, /home/username/petsc).
2. Set `PETSC_ARCH`, which indicates the configuration on which PETSc will be
   used. Note that `$PETSC_ARCH` is simply a name the installer used when installing
   the libraries. There will exist a directory within `$PETSC_DIR` that is named after
   its corresponding `$PETSC_ARCH`. There may be several on a single system, for
   example "linux-c-debug" for the debug version compiled by a C compiler or
   "linux-c-opt" for the optimized version.
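In a bash-like shell, the two steps amount to the following (the path and architecture
name here are illustrative, not required values):

```console
$ export PETSC_DIR=/home/username/petsc
$ export PETSC_ARCH=linux-c-opt
$ ls $PETSC_DIR/$PETSC_ARCH/lib    # the installed libraries live here
```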

:::{admonition} Still Stuck?
See the {ref}`quick-start tutorial <tut_install>` for a step-by-step guide on
installing PETSc, in case you have missed a step.

See the users manual section on {ref}`getting started <sec_getting_started>`.
:::

### The PETSc distribution is SO large. How can I reduce my disk space usage?

1. The PETSc users manual is provided in PDF format at `$PETSC_DIR/manual.pdf`. You
   can delete this.
2. The PETSc test suite contains sample output for many of the examples. These are
   contained in the PETSc directories `$PETSC_DIR/src/*/tutorials/output` and
   `$PETSC_DIR/src/*/tests/output`. Once you have run the test examples, you may remove
   all of these directories to save some disk space. You can locate the largest with,
   e.g., `find . -name output -type d | xargs du -sh | sort -hr` on a Unix-based system.
3. The debugging versions of the libraries are larger than the optimized versions. In a
   pinch you can work with the optimized version, although we bid you good luck in
   finding bugs, as it is much easier with the debug version.

### I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI?

No, run `configure` with the option `--with-mpi=0`.

### Can I install PETSc to not use X windows (either under Unix or Microsoft Windows with GCC)?

Yes. Run `configure` with the additional flag `--with-x=0`.

### Why do you use MPI?

MPI is the message-passing standard. Because it is a standard, it will not change frequently
over time; thus, we do not have to change PETSc every time the provider of the message-passing
system decides to make an interface change. MPI was carefully designed by experts from
industry, academia, and government labs to provide the highest quality performance and
capability.

For example, the careful design of communicators in MPI allows the easy nesting of
different libraries; no other message-passing system provides this support. All of the
major parallel computer vendors were involved in the design of MPI and have committed to
providing quality implementations.

In addition, since MPI is a standard, several different groups have already provided
complete, free implementations. Thus, one does not have to rely on the technical skills of
one particular group to provide the message-passing libraries. Today, MPI is the only
practical, portable approach to writing efficient parallel numerical software.

(invalid_mpi_compilers)=

### What do I do if my MPI compiler wrappers are invalid?

Most MPI implementations provide compiler wrappers (such as `mpicc`) which give the
underlying compilers the include and link options necessary to use that version of
MPI. Configuration will fail if these wrappers are either absent or broken in the MPI pointed to by
`--with-mpi-dir`. You can rerun `configure` with the additional option
`--with-mpi-compilers=0`, which will try to auto-detect working compilers; however,
these compilers may be incompatible with the particular MPI build. If this fix does not
work, run with `--with-cc=[your_c_compiler]` where you know `your_c_compiler` works
with this particular MPI, and likewise for C++ (`--with-cxx=[your_cxx_compiler]`) and Fortran
(`--with-fc=[your_fortran_compiler]`).

(bit_indices)=

### When should/can I use the `configure` option `--with-64-bit-indices`?

By default the type that PETSc uses to index into arrays and to store the sizes of arrays
is `PetscInt`, defined to be a 32-bit `int`. If your problem:

- involves more than 2^31 - 1 unknowns (around 2 billion), or
- has a matrix that might contain more than 2^31 - 1 nonzeros on a single process,

then you need to use this option. Otherwise you will get strange crashes.

This option can be used when you are using either 32-bit or 64-bit pointers. You do not
need to use this option if you are using 64-bit pointers, unless one of the two conditions
above holds.
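To get a feel for where the threshold lies: a 3-D grid of 1291^3 points with a single
unknown per point already exceeds what a 32-bit `PetscInt` can index:

```console
$ python3 -c "print(1291**3, 2**31 - 1, 1291**3 > 2**31 - 1)"
2151685171 2147483647 True
```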

### What if I get an internal compiler error?

You can rebuild the offending file individually with a lower optimization level. **Then
make sure to complain to the compiler vendor and file a bug report**. For example, if the
compiler chokes on `src/mat/impls/baij/seq/baijsolvtrannat.c` you can run the following
from `$PETSC_DIR`:

```console
$ make -f gmakefile PCC_FLAGS="-O1" $PETSC_ARCH/obj/src/mat/impls/baij/seq/baijsolvtrannat.o
$ make all
```

### How do I enable Python bindings (petsc4py) with PETSc?

1. Install [Cython](https://cython.org/).
2. `configure` PETSc with the `--with-petsc4py=1` option.
3. Set `PYTHONPATH=$PETSC_DIR/$PETSC_ARCH/lib`.
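Once these steps are done, a quick sanity check from the shell could look like the
following; this assumes the petsc4py build above succeeded and that your petsc4py version
provides `PETSc.Sys.getVersion()`:

```console
$ export PYTHONPATH=$PETSC_DIR/$PETSC_ARCH/lib
$ python3 -c "from petsc4py import PETSc; print(PETSc.Sys.getVersion())"
```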

(macos_gfortran)=

### What Fortran compiler do you recommend on macOS?

We recommend using [homebrew](https://brew.sh/) to install [gfortran](https://gcc.gnu.org/wiki/GFortran); see {any}`doc_macos_install`.

### How can I find the URL locations of the packages you install using `--download-PACKAGE`?

```console
$ grep "self.download " $PETSC_DIR/config/BuildSystem/config/packages/*.py
```

### How do I fix the problem: PETSc was configured with one MPICH (or Open MPI) `mpi.h` version but now appears to be compiling using a different MPICH (or Open MPI) `mpi.h` version?

This generally happens for one of two reasons:

- You previously ran `configure` with the option `--download-mpich` (or `--download-openmpi`)
  but later ran `configure` to use a version of MPI already installed on the
  machine. Solution:

  ```console
  $ rm -rf $PETSC_DIR/$PETSC_ARCH
  $ ./configure --your-args
  ```

(mpi_network_misconfigure)=

### What does it mean when `make check` hangs or errors in `PetscOptionsInsertFile()`?

For example:

```none
Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
See https://petsc.org/release/faq/
[0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #3 PetscInitialize() line 828 in /Users/barrysmith/Src/PETSc/src/sys/objects/pinit.c
```

or

```none
$ make check
Running check examples to verify correct installation
Using PETSC_DIR=/Users/barrysmith/Src/petsc and PETSC_ARCH=arch-fix-mpiexec-hang-2-ranks
C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
PROGRAM SEEMS TO BE HANGING HERE
```

This usually occurs when network settings are misconfigured (perhaps due to a VPN),
resulting in a failure or hang in the system call `gethostbyname()`.

- Verify you are using the correct `mpiexec` for the MPI you have linked PETSc with.

- If you have a VPN enabled on your machine, try turning it off and then running `make check` to
  verify that it is not the VPN playing poorly with MPI.

- If ``ping `hostname` `` (`/sbin/ping` on macOS) fails or hangs, do:

  ```none
  echo 127.0.0.1 `hostname` | sudo tee -a /etc/hosts
  ```

  and try `make check` again.

- Try completely disconnecting your machine from the network and see if `make check` then works.

- Try the PETSc `configure` option `--download-mpich-device=ch3:nemesis` with `--download-mpich`.
4759b92b1d3SBarry Smith
______________________________________________________________________

## Usage

### How can I redirect PETSc's `stdout` and `stderr` when programming with a GUI interface in Windows Developer Studio or to C++ streams?

To overload just the error messages, write your own `MyPrintError()` function that does
whatever you want (including pop-up windows, etc.) and use it as shown below.

```c
extern "C" {
  int PASCAL WinMain(HINSTANCE,HINSTANCE,LPSTR,int);
};

#include <petscsys.h>
#include <mpi.h>

const char help[] = "Set up from main";

int MyPrintError(const char error[], ...)
{
  printf("%s", error);
  return 0;
}

int main(int ac, char *av[])
{
  char      buf[256];
  HINSTANCE inst;

  inst = (HINSTANCE)GetModuleHandle(NULL);
  PetscErrorPrintf = MyPrintError;

  buf[0] = 0;
  for (int i = 1; i < ac; ++i) {
    strcat(buf, av[i]);
    strcat(buf, " ");
  }

  PetscCall(PetscInitialize(&ac, &av, NULL, help));

  return WinMain(inst, NULL, buf, SW_SHOWNORMAL);
}
```

Place this file in the project and compile with these preprocessor definitions:

```
WIN32
_DEBUG
_CONSOLE
_MBCS
USE_PETSC_LOG
USE_PETSC_BOPT_g
USE_PETSC_STACK
_AFXDLL
```

And these link options:

```
/nologo
/subsystem:console
/incremental:yes
/debug
/machine:I386
/nodefaultlib:"libcmtd.lib"
/nodefaultlib:"libcd.lib"
/nodefaultlib:"mvcrt.lib"
/pdbtype:sept
```

:::{note}
The above is compiled and linked as if it were a console program. The linker will search
for a `main`, and from it the `WinMain` will start. This works with MFC templates and
derived classes too.

When writing a Windows console application you do not need to do anything; the `stdout`
and `stderr` are automatically output to the console window.
:::

To change where all PETSc `stdout` and `stderr` go (you can also reassign
`PetscVFPrintf()` to handle `stdout` and `stderr` any way you like), write the
following function:

```c
PetscErrorCode mypetscvfprintf(FILE *fd, const char format[], va_list Argp)
{
  PetscFunctionBegin;
  if (fd != stdout && fd != stderr) { /* handle regular files */
    PetscCall(PetscVFPrintfDefault(fd, format, Argp));
  } else {
    char buff[1024]; /* Make sure to assign a large enough buffer */
    int  length;

    PetscCall(PetscVSNPrintf(buff, 1024, format, &length, Argp));

    /* now send buff to whatever stream or whatever you want */
  }
  PetscFunctionReturn(PETSC_SUCCESS);
}
```

Then assign `PetscVFPrintf = mypetscvfprintf` before `PetscInitialize()` in your main
program.
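
The buffer-then-route pattern used above can be sketched in plain C++ without PETSc; this is an illustrative stand-in, not PETSc API, and the names `my_vfprintf`, `my_printf`, and `captured` are invented for the example. The formatted message is captured into a string instead of a terminal, standing in for "whatever stream you want":

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Captured output accumulates here instead of going to the terminal
// (illustrative stand-in for redirecting to a GUI widget or C++ stream).
static std::string captured;

// A printf-style handler in the spirit of a custom PetscVFPrintf:
// format the message into a buffer, then route it wherever we like.
static int my_vfprintf(const char *format, va_list argp)
{
  char buff[1024]; /* make sure the buffer is large enough */
  int  length = vsnprintf(buff, sizeof(buff), format, argp);
  if (length < 0) return length;
  captured.append(buff);
  return length;
}

// Convenience wrapper so callers can use it like printf.
static int my_printf(const char *format, ...)
{
  va_list argp;
  va_start(argp, format);
  int ret = my_vfprintf(format, argp);
  va_end(argp);
  return ret;
}
```

The same shape (format into a local buffer, then forward) is what the `PetscVSNPrintf()` branch of the PETSc function above performs.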

### I want to use Hypre boomerAMG without GMRES but when I run `-pc_type hypre -pc_hypre_type boomeramg -ksp_type preonly` I don't get a very accurate answer!

You should run with `-ksp_type richardson` to have PETSc run several V or W
cycles. `-ksp_type preonly` causes boomerAMG to use only one V/W cycle. You can control
how many cycles are used in a single application of the boomerAMG preconditioner with
`-pc_hypre_boomeramg_max_iter <it>` (the default is 1). You can also control the
tolerance boomerAMG uses to decide whether to stop before `max_iter` with
`-pc_hypre_boomeramg_tol <tol>` (the default is 1.e-7). Run with `-ksp_view` to see
all the hypre options used and `-help | grep boomeramg` to see all the command line
options.

### How do I use PETSc for Domain Decomposition?

PETSc includes Additive Schwarz methods in the suite of preconditioners under the umbrella
of `PCASM`. These may be activated with the runtime option `-pc_type asm`. Various
other options may be set, including the degree of overlap with `-pc_asm_overlap <number>`,
the type of restriction/extension with `-pc_asm_type [basic,restrict,interpolate,none]`,
and several others. You may see the available ASM options by using `-pc_type asm
-help`. See the procedural interfaces in the manual pages, for example `PCASMSetType()`,
and check the index of the users manual for `PCASMCreateSubdomains()`.

PETSc also contains a domain-decomposition-inspired wirebasket or face-based two-level
method where the coarse mesh to fine mesh interpolation is defined by solving specific
local subdomain problems. It currently only works for 3D scalar problems on structured
grids created with PETSc `DMDA`. See the manual page for `PCEXOTIC` and
`src/ksp/ksp/tutorials/ex45.c` for an example.

PETSc also contains a balancing Neumann-Neumann type preconditioner, see the manual page
for `PCBDDC`. This requires matrices be constructed with `MatCreateIS()` via the finite
element method. See `src/ksp/ksp/tests/ex59.c` for an example.

### You have `MATAIJ` and `MATBAIJ` matrix formats, and `MATSBAIJ` for symmetric storage, how come no `MATSAIJ`?

Just for historical reasons; the `MATSBAIJ` format with block size one is just as efficient as
a `MATSAIJ` would be.

### Can I create BAIJ matrices with different size blocks for different block rows?

No. The `MATBAIJ` format only supports a single fixed block size for the entire
matrix. But the `MATAIJ` format automatically searches for matching rows and thus still
takes advantage of the natural blocks in your matrix to obtain good performance.

:::{note}
If you use `MATAIJ` you cannot use `MatSetValuesBlocked()`.
:::

### How do I access the values of a remote parallel PETSc Vec?

1. On each process create a local `Vec` large enough to hold all the values it wishes to
   access.
2. Create a `VecScatter` that scatters from the parallel `Vec` into the local `Vec`.
3. Use `VecGetArray()` to access the values in the local `Vec`.

For example, assume we have distributed a vector `vecGlobal` of size $N$ to
$R$ ranks and each rank holds $N/R = m$ values (assume that
$N$ is cleanly divisible by $R$). We want each rank $r$ to gather the
first $n$ (also assume $n \leq m$) values from its immediately superior neighbor
$r+1$ (the final rank will retrieve from rank 0).

```c
Vec            vecLocal;
IS             isLocal, isGlobal;
VecScatter     ctx;
PetscScalar    *arr;
PetscInt       N, n, firstGlobalIndex; /* n = number of values to gather, set by the user */
MPI_Comm       comm;
PetscMPIInt    r, R;

/* Create sequential local vector, big enough to hold local portion */
PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, &vecLocal));

/* Create IS to describe where we want to scatter to */
PetscCall(ISCreateStride(PETSC_COMM_SELF, n, 0, 1, &isLocal));

/* Compute the global indices */
PetscCall(VecGetSize(vecGlobal, &N));
PetscCall(PetscObjectGetComm((PetscObject) vecGlobal, &comm));
PetscCallMPI(MPI_Comm_rank(comm, &r));
PetscCallMPI(MPI_Comm_size(comm, &R));
firstGlobalIndex = r == R-1 ? 0 : (N/R)*(r+1);

/* Create IS that describes where we want to scatter from */
PetscCall(ISCreateStride(comm, n, firstGlobalIndex, 1, &isGlobal));

/* Create the VecScatter context */
PetscCall(VecScatterCreate(vecGlobal, isGlobal, vecLocal, isLocal, &ctx));

/* Gather the values */
PetscCall(VecScatterBegin(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(VecScatterEnd(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));

/* Retrieve and do work */
PetscCall(VecGetArray(vecLocal, &arr));
/* Work */
PetscCall(VecRestoreArray(vecLocal, &arr));

/* Don't forget to clean up */
PetscCall(ISDestroy(&isLocal));
PetscCall(ISDestroy(&isGlobal));
PetscCall(VecScatterDestroy(&ctx));
PetscCall(VecDestroy(&vecLocal));
```

(doc_faq_usage_alltoone)=

### How do I collect to a single processor all the values from a parallel PETSc Vec?

1. Create the `VecScatter` context that will do the communication:

   ```c
   Vec        in_par,out_seq;
   VecScatter ctx;

   PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
   ```

2. Initiate the communication (this may be repeated if you wish):

   ```c
   PetscCall(VecScatterBegin(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
   PetscCall(VecScatterEnd(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
   /* May destroy context now if no additional scatters are needed, otherwise reuse it */
   PetscCall(VecScatterDestroy(&ctx));
   ```

Note that this simply concatenates in the parallel ordering of the vector (determined by the
`MPI_Comm_rank` of the owning process). If you are using a `Vec` from
`DMCreateGlobalVector()` you likely want to first call `DMDAGlobalToNaturalBegin()`
followed by `DMDAGlobalToNaturalEnd()` to scatter the original `Vec` into the natural
ordering in a new global `Vec` before calling `VecScatterBegin()`/`VecScatterEnd()`
to scatter the natural `Vec` onto all processes.

### How do I collect all the values from a parallel PETSc Vec on the 0th rank?

See the FAQ entry on collecting to {ref}`an arbitrary processor <doc_faq_usage_alltoone>`, but
replace

```c
PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
```

with

```c
PetscCall(VecScatterCreateToZero(in_par, &ctx, &out_seq));
```

:::{note}
The same ordering considerations discussed in the aforementioned entry also apply
here.
:::

### How can I read in or write out a sparse matrix in Matrix Market, Harwell-Boeing, Slapc or other ASCII format?

If you can read or write your matrix using Python or MATLAB/Octave, `PetscBinaryIO`
modules are provided at `$PETSC_DIR/lib/petsc/bin` for each language that can assist
with reading and writing. If you just want to convert `MatrixMarket`, you can use:

```console
$ python -m $PETSC_DIR/lib/petsc/bin/PetscBinaryIO convert matrix.mtx
```

This produces `matrix.petsc`.

You can also call the script directly or import it from your Python code. There is also a
`PETScBinaryIO.jl` Julia package.

For other formats, either adapt one of the above libraries or see the examples in
`$PETSC_DIR/src/mat/tests`, specifically `ex72.c` or `ex78.c`. You will likely need
to modify the code slightly to match your required ASCII format.

:::{note}
Never read or write an ASCII matrix file in parallel.

Instead, read it in sequentially with a standalone code based on `ex72.c` or `ex78.c`,
then save the matrix with the binary viewer `PetscViewerBinaryOpen()` and load the
matrix in parallel in your "real" PETSc program with `MatLoad()`.

For writing, save with the binary viewer and then load with the sequential code to store
it as ASCII.
:::

### Does TSSetFromOptions(), SNESSetFromOptions(), or KSPSetFromOptions() reset all the parameters I previously set, or why do they not seem to work?

If `XXSetFromOptions()` is used (with `-xxx_type aaaa`) to change the type of the
object then all parameters associated with the previous type are removed. Otherwise it
does not reset parameters.

`TS/SNES/KSPSetXXX()` commands that set properties for a particular type of object (such
as `KSPGMRESSetRestart()`) ONLY work if the object is ALREADY of that type. For example,
with

```c
KSP ksp;

PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
PetscCall(KSPGMRESSetRestart(ksp, 10));
```

the restart will be ignored since the type has not yet been set to `KSPGMRES`. To have
those values take effect you should do one of the following:

- Allow setting the type from the command line; if it is not on the command line then the
  default type is automatically set.

```
/* Create generic object */
XXXCreate(..,&obj);
/* Must set all settings here, or default */
XXXSetFromOptions(obj);
```

- Hardwire the type in the code, but allow the user to override it via a subsequent
  `XXXSetFromOptions()` call. This essentially allows the user to customize what the
  "default" type of the object is.

```
/* Create generic object */
XXXCreate(..,&obj);
/* Set type directly */
XXXSetYYYYY(obj,...);
/* Can always change to different type */
XXXSetFromOptions(obj);
```

### How do I compile and link my own PETSc application codes and can I use my own `makefile` or rules for compiling code, rather than PETSc's?

See the {ref}`section <sec_writing_application_codes>` of the users manual on writing
application codes with PETSc.

### Can I use CMake to build my own project that depends on PETSc?

See the {ref}`section <sec_writing_application_codes>` of the users manual on writing
application codes with PETSc.

### How can I put carriage returns in `PetscPrintf()` statements from Fortran?

You can use the same notation as in C, just put a `\n` in the string. Note that no other C
format instruction is supported.

Or you can use the Fortran concatenation `//` and `char(10)`; for example `'some
string'//char(10)//'another string'` on the next line.

### How can I implement callbacks using C++ class methods?

Declare the class method static. Static methods do not have a `this` pointer, but the
`void*` context parameter will usually be cast to a pointer to the class where it can
serve the same function.

:::{admonition} Remember
All PETSc callbacks return `PetscErrorCode`.
:::
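
The static-method pattern can be sketched without PETSc; the callback signature, the `MyProblem` class, and the `invoke()` routine below are all invented stand-ins (real PETSc callbacks return `PetscErrorCode` and have library-specific signatures):

```cpp
#include <cassert>

// A PETSc-style callback signature: plain function pointer plus a void* context.
typedef int (*ResidualFn)(double x, double *out, void *ctx);

// Hypothetical user class whose state the callback needs.
class MyProblem {
public:
  explicit MyProblem(double shift) : shift_(shift) {}

  // Static method: no `this` pointer, so it matches the C function-pointer type.
  // The void* context is cast back to the owning class instance.
  static int Residual(double x, double *out, void *ctx) {
    MyProblem *self = static_cast<MyProblem *>(ctx);
    *out = x - self->shift_;
    return 0;
  }

private:
  double shift_;
};

// Stand-in for a C library routine that stores and later invokes the callback.
int invoke(ResidualFn fn, double x, double *out, void *ctx) { return fn(x, out, ctx); }
```

When registering with PETSc, the class instance is passed as the context argument (e.g. the final argument of `SNESSetFunction()`), exactly as `&p` is passed to `invoke()` here.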

### Everyone knows that when you code Newton's Method you should compute the function and its Jacobian at the same time. How can one do this in PETSc?

The update in Newton's method is computed as

$$
u^{n+1} = u^n - \lambda \left[J(u^n)\right]^{-1} F(u^n)
$$

The reason PETSc doesn't default to computing both the function and Jacobian at the same
time is:

- In order to do the line search, $F \left(u^n - \lambda \, \text{step} \right)$ may
  need to be computed for several $\lambda$. The Jacobian is not needed for each of
  those, and one does not know in advance which will be the final $\lambda$ until
  after the function value is computed, so many extra Jacobians may be computed.
- In the final step, if $|| F(u^{n+1})||$ satisfies the convergence criteria then a
  Jacobian need not be computed.

You are free to have your `FormFunction()` compute as much of the Jacobian at that point
as you like, keep the information in the user context (the final argument to
`FormFunction()` and `FormJacobian()`), and then retrieve the information in your
`FormJacobian()` function.
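
The caching idea can be sketched in plain C++ for the scalar model problem $F(u) = e^u - 2$, whose Jacobian $J(u) = e^u$ falls out of the residual evaluation for free. The context struct and function names are illustrative, not PETSc API:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical user context: FormFunction stashes quantities here that
// FormJacobian can later reuse instead of recomputing them.
struct UserCtx {
  double cached_exp;  // exp(u) computed during the residual evaluation
  bool   cache_valid; // has FormFunction run since the last state change?
};

// Residual F(u) = exp(u) - 2. While evaluating F we also obtain exp(u),
// which happens to equal J(u) = F'(u), so we cache it in the context.
int FormFunction(double u, double *f, UserCtx *ctx) {
  double e = std::exp(u);
  *f = e - 2.0;
  ctx->cached_exp = e;
  ctx->cache_valid = true;
  return 0;
}

// Jacobian retrieves the cached value rather than recomputing exp(u).
int FormJacobian(double u, double *J, UserCtx *ctx) {
  *J = ctx->cache_valid ? ctx->cached_exp : std::exp(u);
  return 0;
}
```

In a real PETSc code the `UserCtx` pointer is the final argument registered with `SNESSetFunction()` and `SNESSetJacobian()`.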

### Computing the Jacobian or preconditioner is time consuming. Is there any way to compute it less often?

PETSc has a variety of ways of lagging the computation of the Jacobian or the
preconditioner. They are documented in the manual page for `SNESComputeJacobian()`
and in the {ref}`users manual <ch_snes>`:

- `-snes_lag_jacobian` (`SNESSetLagJacobian()`): how often the Jacobian is rebuilt
  (use -1 to never rebuild, use -2 to rebuild the next time requested and then
  never again).
- `-snes_lag_jacobian_persists` (`SNESSetLagJacobianPersists()`): forces lagging of
  the Jacobian through multiple `SNES` solves, same as passing -2 to
  `-snes_lag_jacobian`. By default, each new `SNES` solve triggers a
  recomputation of the Jacobian.
- `-snes_lag_preconditioner` (`SNESSetLagPreconditioner()`): how often the
  preconditioner is rebuilt. Note: if you are lagging the Jacobian the system will
  know that the matrix has not changed and will not recompute the (same)
  preconditioner.
- `-snes_lag_preconditioner_persists` (`SNESSetLagPreconditionerPersists()`): the
  preconditioner lags through multiple `SNES` solves.

:::{note}
These are often (but need not be) used in combination with
`-snes_mf_operator`, which applies the fresh Jacobian matrix-free for every
matrix-vector product. Otherwise the out-of-date matrix-vector product, computed with
the lagged Jacobian, will be used.
:::

By using `KSPMonitorSet()` and/or `SNESMonitorSet()` one can provide code that monitors the
convergence rate and automatically triggers an update of the Jacobian or preconditioner
based on decreasing convergence of the iterative method. For example, if the number of `SNES`
iterations doubles one might trigger a new computation of the Jacobian. Experimentation is
the only general purpose way to determine which approach is best for your problem.
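
The doubling heuristic a monitor might apply can be sketched in a few lines; the function name and thresholds are illustrative, not a PETSc API:

```cpp
#include <cassert>

// Illustrative heuristic a SNES/KSP monitor might apply: request a fresh
// Jacobian once the iteration count has doubled relative to the count
// observed right after the last rebuild.
bool should_rebuild_jacobian(int its_at_last_rebuild, int current_its) {
  if (its_at_last_rebuild <= 0) return false; // no baseline recorded yet
  return current_its >= 2 * its_at_last_rebuild;
}
```

A monitor registered with `SNESMonitorSet()` would track the baseline in its context and, when the heuristic fires, mark the Jacobian for recomputation.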

:::{important}
It is also vital to experiment on your true problem at the scale at which you will be
solving it, since the performance benefits depend on the exact problem and the problem
size!
:::

### How can I use Newton's Method Jacobian free? Can I difference a different function than provided with `SNESSetFunction()`?

The simplest way is with the option `-snes_mf`; this will use finite differencing of the
function provided to `SNESComputeFunction()` to approximate the action of the Jacobian.

:::{important}
Since no matrix representation of the Jacobian is provided, the `-pc_type` used with
this option must be `-pc_type none`. You may provide a custom preconditioner with
`SNESGetKSP()`, `KSPGetPC()`, and `PCSetType()` and use `PCSHELL`.
:::

The option `-snes_mf_operator` will use the Jacobian-free approach to apply the Jacobian (in the
Krylov solvers) but will use whatever matrix you provided with `SNESSetJacobian()`
(assuming you set one) to compute the preconditioner.

To write the code (rather than use the options above) use `MatCreateSNESMF()` and pass
the resulting matrix object to `SNESSetJacobian()`.

For purely matrix-free (like `-snes_mf`) pass the matrix object for both matrix
arguments and pass the function `MatMFFDComputeJacobian()`.

To provide your own approximate Jacobian matrix to compute the preconditioner (like
`-snes_mf_operator`), pass this other matrix as the second matrix argument to
`SNESSetJacobian()`. Make sure your provided `computejacobian()` function calls
`MatAssemblyBegin()` and `MatAssemblyEnd()` separately on **BOTH** matrix arguments
to this function. See `src/snes/tests/ex7.c`.

To difference a different function than that passed to `SNESSetFunction()` to compute the
matrix-free Jacobian multiply, call `MatMFFDSetFunction()` to set that other function. See
`src/snes/tests/ex7.c`.

(doc_faq_usage_condnum)=

### How can I determine the condition number of a matrix?

For small matrices, the condition number can be reliably computed using

```text
-pc_type svd -pc_svd_monitor
```

For larger matrices, you can run with

```text
-pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000
```

to get approximations to the condition number of the operator. This will generally be
accurate for the largest singular values, but may overestimate the smallest singular value
unless the method has converged. Make sure to avoid restarts. To estimate the condition
number of the preconditioned operator, use `-pc_type somepc` in the last command.

You can use [SLEPc](https://slepc.upv.es) for highly scalable, efficient, and quality eigenvalue computations.

### How can I compute the inverse of a matrix in PETSc?

:::{admonition} Are you sure?
:class: yellow

It is very expensive to compute the inverse of a matrix and very rarely needed in
practice. We highly recommend avoiding algorithms that need it.
:::

The inverse of a matrix (dense or sparse) is essentially always dense, so begin by
creating a dense matrix B and filling it with the identity matrix (ones along the diagonal);
also create a dense matrix X of the same size that will hold the solution. Then factor the
matrix you wish to invert with `MatLUFactor()` or `MatCholeskyFactor()`; call the
result A. Then call `MatMatSolve(A,B,X)` to compute the inverse into X. See also the section
on the {any}`Schur complement <how_can_i_compute_the_schur_complement>`.
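
The "invert by solving $AX = I$" idea can be sketched in self-contained C++ with a toy Gauss-Jordan routine; this is a stand-in for the `MatLUFactor()`/`MatMatSolve()` sequence, not production code, and the name `invert` is invented for the example:

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <utility>

// Toy dense inverse via "solve A X = I": Gauss-Jordan with partial pivoting
// applied to [A | X], where X starts as the identity (the matrix B above).
template <int N>
std::array<std::array<double, N>, N> invert(std::array<std::array<double, N>, N> A) {
  std::array<std::array<double, N>, N> X{};
  for (int i = 0; i < N; ++i) X[i][i] = 1.0; // X = identity

  for (int col = 0; col < N; ++col) {
    // Partial pivoting: swap in the largest remaining pivot row.
    int piv = col;
    for (int r = col + 1; r < N; ++r)
      if (std::fabs(A[r][col]) > std::fabs(A[piv][col])) piv = r;
    std::swap(A[col], A[piv]);
    std::swap(X[col], X[piv]);

    double d = A[col][col]; // assume nonsingular
    for (int c = 0; c < N; ++c) { A[col][c] /= d; X[col][c] /= d; }

    // Eliminate the column from every other row.
    for (int r = 0; r < N; ++r) {
      if (r == col) continue;
      double f = A[r][col];
      for (int c = 0; c < N; ++c) { A[r][c] -= f * A[col][c]; X[r][c] -= f * X[col][c]; }
    }
  }
  return X; // X now holds A^{-1}
}
```

Note how the right-hand side starts as the identity and receives every row operation applied to A, which is exactly what `MatMatSolve(A,B,X)` does with a factored A and B = I.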
9869b92b1d3SBarry Smith
(how_can_i_compute_the_schur_complement)=

### How can I compute the Schur complement in PETSc?

:::{admonition} Are you sure?
:class: yellow

It is very expensive to compute the Schur complement of a matrix and very rarely needed
in practice. We highly recommend avoiding algorithms that need it.
:::

The Schur complement of the matrix $M \in \mathbb{R}^{\left(p+q \right) \times
\left(p + q \right)}$

$$
M = \begin{bmatrix}
A & B \\
C & D
\end{bmatrix}
$$

where

$$
A \in \mathbb{R}^{p \times p}, \quad B \in \mathbb{R}^{p \times q}, \quad C \in \mathbb{R}^{q \times p}, \quad D \in \mathbb{R}^{q \times q}
$$

is given by

$$
S_D := A - BD^{-1}C \\
S_A := D - CA^{-1}B
$$

Like the inverse, the Schur complement of a matrix (dense or sparse) is essentially always
dense, so assuming you wish to calculate $S_A = D - C \underbrace{
\overbrace{(A^{-1})}^{U} B}_{V}$ begin by:

1. Create a dense matrix $B$.
2. Create another dense matrix $V$ of the same size.
3. Either factor the matrix $A$ directly with `MatLUFactor()` or
   `MatCholeskyFactor()`, or use `MatGetFactor()` followed by
   `MatLUFactorSymbolic()` and `MatLUFactorNumeric()` if you wish to use an
   external solver package like SuperLU_Dist. Call the result $U$.
4. Call `MatMatSolve(U,B,V)`.
5. Call `MatMatMult(C,V,MAT_INITIAL_MATRIX,1.0,&S)`.
6. Call `MatAXPY(S,-1.0,D,SUBSET_NONZERO_PATTERN)`.
7. Finally, call `MatScale(S,-1.0)`.

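
The arithmetic of these steps can be checked on a tiny example without PETSc. This
plain-C sketch (the function name `schur_2p1` is ours) performs the same sequence for
$p = 2$, $q = 1$: solve with $A$, multiply by $C$, subtract $D$, flip the sign.

```
#include <assert.h>
#include <math.h>

/* Illustration only (plain C, no PETSc): S_A = D - C A^{-1} B for p = 2, q = 1,
 * i.e. A is 2x2, B is 2x1, C is 1x2, D is 1x1. Comments map each line to the
 * steps described above. */
static double schur_2p1(const double A[2][2], const double B[2],
                        const double C[2], double D)
{
  double det = A[0][0] * A[1][1] - A[0][1] * A[1][0];
  /* "factor" A and solve: V = A^{-1} B (the MatMatSolve step) */
  double V0 = ( A[1][1] * B[0] - A[0][1] * B[1]) / det;
  double V1 = (-A[1][0] * B[0] + A[0][0] * B[1]) / det;
  /* S = C * V (the MatMatMult step) */
  double S = C[0] * V0 + C[1] * V1;
  /* S = S - D (the MatAXPY step), then S = -S (the MatScale step) */
  return -(S - D);
}
```
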
When computing Schur complements this way it does not make sense to use the `KSP`
iterative solvers, since for many moderate-size problems a direct factorization is much
faster than an iterative solve. As you can see, this approach requires a great deal of
work space and computation, so it is best avoided.

However, it is not necessary to assemble the Schur complement $S$ in order to solve
systems with it. Use `MatCreateSchurComplement(A,A_pre,B,C,D,&S)` to create a
matrix that applies the action of $S$ (using `A_pre` to solve with `A`), but
does not assemble.

Alternatively, if you already have a block matrix `M = [A, B; C, D]` (in some
ordering), then you can create index sets (`IS`) `isa` and `isb` to address each
block, then use `MatGetSchurComplement()` to create the Schur complement and/or an
approximation suitable for preconditioning.

Since $S$ is generally dense, standard preconditioning methods cannot typically be
applied directly to Schur complements. There are many approaches to preconditioning Schur
complements including using the `SIMPLE` approximation

$$
D - C \text{diag}(A)^{-1} B
$$

to create a sparse matrix that approximates the Schur complement (this is returned by
default for the optional "preconditioning" matrix in `MatGetSchurComplement()`).

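
The SIMPLE approximation replaces $A^{-1}$ by $\text{diag}(A)^{-1}$, which keeps the
result cheap to form. A minimal plain-C sketch for $p = 2$, $q = 1$ (the name
`simple_2p1` is ours, for illustration):

```
#include <assert.h>
#include <math.h>

/* Illustration only (plain C, no PETSc): the SIMPLE-style approximation
 * D - C diag(A)^{-1} B for p = 2, q = 1. Only the diagonal of A is inverted,
 * so for sparse blocks the result stays sparse and cheap to form. */
static double simple_2p1(const double A[2][2], const double B[2],
                         const double C[2], double D)
{
  return D - (C[0] * B[0] / A[0][0] + C[1] * B[1] / A[1][1]);
}
```

When $A$ happens to be diagonal the approximation is exact; its quality degrades as the
off-diagonal part of $A$ grows.
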
An alternative is to interpret the matrices as differential operators and apply
approximate commutator arguments to find a spectrally equivalent operation that can be
applied efficiently (see the "PCD" preconditioners {cite}`elman_silvester_wathen_2014`). A
variant of this is the least squares commutator, which is closely related to the
Moore-Penrose pseudoinverse, and is available in `PCLSC` which operates on matrices of
type `MATSCHURCOMPLEMENT`.

### Do you have examples of doing unstructured grid Finite Element Method (FEM) with PETSc?

There are at least three ways to write finite element codes using PETSc:

1. Use `DMPLEX`, which is a high level approach to manage your mesh and
   discretization. See the {ref}`tutorials sections <tut_stokes>` for further information,
   or see `src/snes/tutorials/ex62.c`.
2. Use packages such as [deal.ii](https://www.dealii.org), [libMesh](https://libmesh.github.io), or
   [Firedrake](https://www.firedrakeproject.org), which use PETSc for the solvers.
3. Manage the grid data structure yourself and use PETSc `PetscSF`, `IS`, and `VecScatter` to
   handle the required ghost point communication. See
   `src/snes/tutorials/ex10d/ex10.c`.

### DMDA decomposes the domain differently than the MPI_Cart_create() command. How can one use them together?

The `MPI_Cart_create()` first divides the mesh along the z direction, then the y, then
the x. `DMDA` divides along the x, then y, then z. Thus, for example, rank 1 of the
processes will be in a different part of the mesh for the two schemes. To resolve this you
can create a new MPI communicator that you pass to `DMDACreate()` that renumbers the
process ranks so that each physical process shares the same part of the mesh with both the
`DMDA` and the `MPI_Cart_create()`. The code to determine the new numbering was
provided by Rolf Kuiper:

```
// the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively
// (no parallelization in direction 'dir' means dir_procs = 1)

MPI_Comm    NewComm;
int         x, y, z;
PetscMPIInt MPI_Rank, NewRank;

// get rank from MPI ordering:
PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &MPI_Rank));

// calculate coordinates of cpus in MPI ordering:
x = MPI_Rank / (z_procs*y_procs);
y = (MPI_Rank % (z_procs*y_procs)) / z_procs;
z = (MPI_Rank % (z_procs*y_procs)) % z_procs;

// set new rank according to PETSc ordering:
NewRank = z*y_procs*x_procs + y*x_procs + x;

// create communicator with new ranks according to PETSc ordering
PetscCallMPI(MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm));

// override the default communicator (was MPI_COMM_WORLD as default)
PETSC_COMM_WORLD = NewComm;
```

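
The renumbering arithmetic can be sanity-checked without MPI: the map from
`MPI_Cart_create()`-style ranks to `DMDA`-style ranks must be a permutation of
`0..P-1`. A plain-C sketch (the helper name `petsc_rank` is ours):

```
#include <assert.h>

/* Illustration only (plain C, no MPI): the same renumbering arithmetic as
 * above, mapping an MPI_Cart_create-style rank to the corresponding
 * DMDA-style rank for an x_procs * y_procs * z_procs process grid. */
static int petsc_rank(int mpi_rank, int x_procs, int y_procs, int z_procs)
{
  int x = mpi_rank / (z_procs * y_procs);
  int y = (mpi_rank % (z_procs * y_procs)) / z_procs;
  int z = (mpi_rank % (z_procs * y_procs)) % z_procs;
  return z * y_procs * x_procs + y * x_procs + x;
}
```
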
### When solving a system with Dirichlet boundary conditions I can use MatZeroRows() to eliminate the Dirichlet rows but this results in a non-symmetric system. How can I apply Dirichlet boundary conditions but keep the matrix symmetric?

- For nonsymmetric systems put the appropriate boundary solutions in the x vector and use
  `MatZeroRows()` followed by `KSPSetOperators()`.
- For symmetric problems use `MatZeroRowsColumns()`.
- If you have many Dirichlet locations you can use `MatZeroRows()` (**not**
  `MatZeroRowsColumns()`) and `-ksp_type preonly -pc_type redistribute` (see
  `PCREDISTRIBUTE`) and PETSc will repartition the parallel matrix for load
  balancing. In this case the new matrix solved remains symmetric even though
  `MatZeroRows()` is used.

An alternative approach is, when assembling the matrix (generating values and passing
them to the matrix), never to include locations for the Dirichlet grid points in the
vector and matrix, instead taking them into account as you put the other values into the
load vector.

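
The symmetric elimination that `MatZeroRowsColumns()` performs can be sketched in plain
C on a small dense system (the helper name `dirichlet_symmetric` is ours): the known
value is lifted into the right-hand side, then row and column j are zeroed and the
diagonal entry set to 1, so the matrix stays symmetric.

```
#include <assert.h>
#include <math.h>

/* Illustration only (plain C, no PETSc): symmetric elimination of one
 * Dirichlet dof j with value g on a 3x3 system K u = b, in the spirit of
 * MatZeroRowsColumns(). The known column is moved into the load vector,
 * then row/column j are zeroed and the diagonal entry set to 1. */
static void dirichlet_symmetric(double K[3][3], double b[3], int j, double g)
{
  for (int i = 0; i < 3; i++) {
    if (i == j) continue;
    b[i] -= K[i][j] * g; /* lift the known value into the load vector */
    K[i][j] = 0.0;
    K[j][i] = 0.0;
  }
  K[j][j] = 1.0;
  b[j] = g; /* the trivial equation u_j = g */
}
```
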
### How can I get PETSc vectors and matrices to MATLAB or vice versa?

There are numerous ways to work with PETSc and MATLAB. All but the first approach
require PETSc to be configured with `--with-matlab`.

1. To save PETSc `Mat` and `Vec` to files that can be read from MATLAB use the
   `PetscViewerBinaryOpen()` viewer and `VecView()` or `MatView()` to save objects
   for MATLAB and `VecLoad()` and `MatLoad()` to get the objects that MATLAB has
   saved. See `share/petsc/matlab/PetscBinaryRead.m` and
   `share/petsc/matlab/PetscBinaryWrite.m` for loading and saving the objects in MATLAB.
2. Using the [MATLAB Engine](https://www.mathworks.com/help/matlab/calling-matlab-engine-from-c-programs-1.html)
   allows PETSc to automatically call MATLAB to perform some specific computations. This
   does not allow MATLAB to be used interactively by the user. See
   `PetscMatlabEngine`.
3. You can open a socket connection between MATLAB and PETSc to allow sending objects back
   and forth between an interactive MATLAB session and a running PETSc program. See
   `PetscViewerSocketOpen()` for access from the PETSc side and
   `share/petsc/matlab/PetscBinaryRead.m` for access from the MATLAB side.
4. You can save PETSc `Vec` (**not** `Mat`) with the `PetscViewerMatlabOpen()`
   viewer, which saves `.mat` files that can then be loaded into MATLAB using the
   `load()` command.

### How do I get started with Cython so that I can extend petsc4py?

1. Learn how to [build a Cython module](http://docs.cython.org/src/quickstart/build.html).
2. Go through the simple [example](https://stackoverflow.com/questions/3046305/simple-wrapping-of-c-code-with-cython). Note
   also the next comment that shows how to create numpy arrays in Cython and pass them
   back.
3. Check out [this page](http://docs.cython.org/src/tutorial/numpy.html) which tells
   you how to get fast indexing.
4. Have a look at the petsc4py [array source](http://code.google.com/p/petsc4py/source/browse/src/PETSc/arraynpy.pxi).

### How do I compute a custom norm for KSP to use as a convergence test or for monitoring?

You need to call `KSPBuildResidual()` on your `KSP` object and then compute the
appropriate norm on the resulting residual. Note that depending on the
`KSPSetNormType()` of the method, the norm you compute may not match the norm used
internally by the method. See also `KSPSetPCSide()`.

### If I have a sequential program can I use a PETSc parallel solver?

:::{important}
Do not expect to get great speedups! Much of the speedup gained by using parallel
solvers comes from building the underlying matrices and vectors in parallel to begin
with. You should see some reduction in the time for the linear solvers.
:::

Yes, you must set up PETSc with MPI (even though you will not use MPI) with at least the
following options:

```console
$ ./configure --download-superlu_dist --download-parmetis --download-metis --with-openmp
```

Your compiler must support OpenMP. To have the linear solver run in parallel run your
program with

```console
$ OMP_NUM_THREADS=n ./myprog -pc_type lu -pc_factor_mat_solver_type superlu_dist
```

where `n` is the number of threads and should be less than or equal to the number of cores
available.

:::{note}
If your code is MPI parallel you can also use these same options to have SuperLU_Dist
utilize multiple threads per MPI process for the direct solver. Make sure that the
`$OMP_NUM_THREADS` you use per MPI process is less than or equal to the number of
cores available for each MPI process. For example if your compute nodes have 6 cores
and you use 2 MPI processes per node then set `$OMP_NUM_THREADS` to 2 or 3.
:::

Another approach that allows using a PETSc parallel solver is to use `PCMPI`.

### TS or SNES produces infeasible (out of domain) solutions or states. How can I prevent this?

For `TS` call `TSSetFunctionDomainError()`. For both `TS` and `SNES` call
`SNESSetFunctionDomainError()` when the solver passes an infeasible (out of domain)
solution or state to your routines.

If it occurs for DAEs, it is important to ensure the algebraic constraints are well
satisfied, which can prevent "breakdown" later. Thus, one can try using a tight tolerance
for `SNES`, using a direct linear solver (`PCType` of `PCLU`) when possible, and reducing the timestep (or
tightening `TS` tolerances for adaptive time stepping).

### Can PETSc work with Hermitian matrices?

PETSc's support of Hermitian matrices is limited. Many operations and solvers work
for symmetric (`MATSBAIJ`) matrices and operations on transpose matrices but there is
little direct support for Hermitian matrices and Hermitian transpose (complex conjugate
transpose) operations. There is `KSPSolveTranspose()` for solving the transpose of a
linear system but no `KSPSolveHermitian()`.

For creating known Hermitian matrices:

- `MatCreateNormalHermitian()`
- `MatCreateHermitianTranspose()`

For determining or setting Hermitian status on existing matrices:

- `MatIsHermitian()`
- `MatIsHermitianKnown()`
- `MatIsStructurallySymmetric()`
- `MatIsSymmetricKnown()`
- `MatIsSymmetric()`
- `MatSetOption()` (use with `MAT_SYMMETRIC` or `MAT_HERMITIAN` to assert to PETSc
  that either is the case).

For performing matrix operations on known Hermitian matrices (note that regular `Mat`
functions such as `MatMult()` will of course also work on Hermitian matrices):

- `MatMultHermitianTranspose()`
- `MatMultHermitianTransposeAdd()` (very limited support)

### How can I assemble a bunch of similar matrices?

You can first add the values common to all the matrices, then use `MatStoreValues()` to
stash the common values. Each iteration you call `MatRetrieveValues()`, then set the
unique values in a matrix and assemble.

### Can one resize or change the size of PETSc matrices or vectors?

No, once the vector or matrix sizes have been set and the matrices or vectors are fully
usable one cannot change the size of the matrices or vectors or the number of processes
they live on. One may create new vectors and copy the old values into them, for example
using `VecScatterCreate()`.

### How can one compute the nullspace of a sparse matrix with MUMPS?

Assuming you have an existing matrix $A$ whose nullspace $V$ you want to find:

```
Mat      F, work, V;
PetscInt N, rows;
MPI_Comm comm = PetscObjectComm((PetscObject)A);

/* Determine factorability */
PetscCall(MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F));
PetscCall(MatGetLocalSize(A, &rows, NULL));

/* Set MUMPS options, see MUMPS documentation for more information */
PetscCall(MatMumpsSetIcntl(F, 24, 1));
PetscCall(MatMumpsSetIcntl(F, 25, 1));

/* Perform factorization */
PetscCall(MatLUFactorSymbolic(F, A, NULL, NULL, NULL));
PetscCall(MatLUFactorNumeric(F, A, NULL));

/* This is the dimension of the null space */
PetscCall(MatMumpsGetInfog(F, 28, &N));

/* This will contain the null space in the columns */
PetscCall(MatCreateDense(comm, rows, N, PETSC_DETERMINE, PETSC_DETERMINE, NULL, &V));

PetscCall(MatDuplicate(V, MAT_DO_NOT_COPY_VALUES, &work));
PetscCall(MatMatSolve(F, work, V));
```

______________________________________________________________________

## Execution

### PETSc executables are SO big and take SO long to link

:::{note}
See {ref}`shared libraries section <doc_faq_sharedlibs>` for more information.
:::

We find this annoying as well. On most machines PETSc can use shared libraries, so
executables should be much smaller; run `configure` with the additional option
`--with-shared-libraries` (this is the default). Also, if you have room, compiling and
linking PETSc on your machine's `/tmp` disk or similar local disk, rather than over the
network, will be much faster.

### How does PETSc's `-help` option work? Why is it different for different programs?

There are two ways in which one interacts with the options database:

- `PetscOptionsGetXXX()` where `XXX` is some type or data structure (for example
  `PetscOptionsGetBool()` or `PetscOptionsGetScalarArray()`). This is a classic
  "getter" function, which queries the command line options for a matching option name,
  and returns the specified value.
- `PetscOptionsXXX()` where `XXX` is some type or data structure (for example
  `PetscOptionsBool()` or `PetscOptionsScalarArray()`). This is a so-called "provider"
  function. It first records the option name in an internal list of previously encountered
  options before calling `PetscOptionsGetXXX()` to query the status of said option.

While users generally use the first option, developers will *always* use the second
(provider) variant of functions. Thus, as the program runs, it will build up a list of
encountered option names which are then printed **in the order of their appearance on the
root rank**. Different programs may take different paths through PETSc source code, so
they will encounter different providers, and therefore have different `-help` output.

### PETSc has so many options for my program that it is hard to keep them straight

Running the PETSc program with the option `-help` will print out many of the options. To
see which options have been specified within a program, use `-options_left`, which
prints all options used as well as any options that the user specified but that were not
actually used by the program; this is helpful for detecting typos. The PETSc website has
a search option, in the upper right hand corner, that quickly finds answers to most PETSc
questions.

### PETSc automatically handles many of the details in parallel PDE solvers. How can I understand what is really happening within my program?

You can use the option `-info` to get more details about the solution process. The
option `-log_view` provides details about the distribution of time spent in the various
phases of the solution process. You can run with `-ts_view` or `-snes_view` or
`-ksp_view` to see what solver options are being used. Run with `-ts_monitor`,
`-snes_monitor`, or `-ksp_monitor` to watch convergence of the
methods. `-snes_converged_reason` and `-ksp_converged_reason` will indicate why and if
the solvers have converged.

### Assembling large sparse matrices takes a long time. What can I do to make this process faster? Or MatSetValues() is so slow; what can I do to speed it up?

You probably need to do preallocation, as explained in {any}`sec_matsparse`.
See also the {ref}`performance chapter <ch_performance>` of the users manual.

For GPUs (and even CPUs) you can use `MatSetPreallocationCOO()` and `MatSetValuesCOO()` for more rapid assembly.

### How can I generate performance summaries with PETSc?

Use the runtime option `-log_view`, which outputs a comprehensive timing, memory
consumption, and communications digest for your program. See the
{ref}`profiling chapter <ch_profiling>` of the users manual for information on
interpreting the summary data.

### How do I know the amount of time spent on each level of the multigrid solver/preconditioner?

Run with `-log_view` and `-pc_mg_log`.

### Where do I get the input matrices for the examples?

Some examples use `$DATAFILESPATH/matrices/medium` and other files. These test matrices
in PETSc binary format can be found in the [datafiles repository](https://gitlab.com/petsc/datafiles).

### When I dump some matrices and vectors to binary, I seem to be generating some empty files with `.info` extensions. What's the deal with these?

PETSc binary viewers put some additional information into `.info` files like matrix
block size. It is harmless but if you *really* don't like it you can use
`-viewer_binary_skip_info` or `PetscViewerBinarySkipInfo()`.

:::{note}
You need to call `PetscViewerBinarySkipInfo()` before
`PetscViewerFileSetName()`. In other words you **cannot** use
`PetscViewerBinaryOpen()` directly.
:::

### Why is my parallel solver slower than my sequential solver, or I have poor speed-up?

This can happen for many reasons:

1. Make sure it is truly the time in `KSPSolve()` that is slower (by running the code
   with `-log_view`). Often the slower time is in generating the matrix or some other
   operation.
2. There must be enough work for each process to outweigh the communication time. We
   recommend an absolute minimum of about 10,000 unknowns per process, better is 20,000 or
   more. This is even more true when using multiple GPUs, where you need to have millions
   of unknowns per GPU.
3. Make sure the {ref}`communication speed of the parallel computer
   <doc_faq_general_parallel>` is good enough for parallel solvers.
4. Check the number of solver iterations with the parallel solver against the sequential
   solver. Most preconditioners require more iterations when used on more processes; this
   is particularly true for block Jacobi (the default parallel preconditioner). You can
   try `-pc_type asm` (`PCASM`), whose iterations scale a bit better with more
   processes. You may also consider multigrid preconditioners like `PCMG` or BoomerAMG
   in `PCHYPRE`.

(doc_faq_pipelined)=

### What steps are necessary to make the pipelined solvers execute efficiently?

Pipelined solvers like `KSPPGMRES`, `KSPPIPECG`, `KSPPIPECR`, and `KSPGROPPCG` may
require special MPI configuration to effectively overlap reductions with computation. In
general, this requires an MPI-3 implementation, an implementation that supports multiple
threads, and use of a "progress thread". Consult the documentation from your vendor or
computing facility for more details.

- Cray MPT 5.6 and later implements MPI-3; set `$MPICH_MAX_THREAD_SAFETY` to "multiple"
  for threads, plus either `$MPICH_ASYNC_PROGRESS` or
  `$MPICH_NEMESIS_ASYNC_PROGRESS`. E.g.
  ```console
  $ export MPICH_MAX_THREAD_SAFETY=multiple
  $ export MPICH_ASYNC_PROGRESS=1
  $ export MPICH_NEMESIS_ASYNC_PROGRESS=1
  ```

- MPICH version 3.0 and later implements the MPI-3 standard and the default
  configuration supports use of threads. Use of a progress thread is configured by
  setting the environment variable `$MPICH_ASYNC_PROGRESS`. E.g.
  ```console
  $ export MPICH_ASYNC_PROGRESS=1
  ```

### When using PETSc in single precision mode (`--with-precision=single` when running `configure`) are the operations done in single or double precision?

PETSc does **NOT** do any explicit conversion of single precision to double before
performing computations; it depends on the hardware and compiler for what happens. For
example, the compiler could choose to put the single precision numbers into the usual
double precision registers and then use the usual double precision floating point unit. Or
it could use SSE2 instructions that work directly on the single precision numbers. It is a
bit of a mystery what decisions get made sometimes. There may be compiler flags in some
circumstances that can affect this.

### Why is Newton's method (SNES) not converging, or converging slowly?

Newton's method may not converge for many reasons; here are some of the most common:

- The Jacobian is wrong (or correct in sequential but not in parallel).
- The linear system is {ref}`not solved <doc_faq_execution_kspconv>` or is not solved
  accurately enough.
- The Jacobian system has a singularity that the linear solver is not handling.
- There is a bug in the function evaluation routine.
- The function is not continuous or does not have continuous first derivatives (e.g. phase
  change or TVD limiters).
- The equations may not have a solution (e.g. limit cycle instead of a steady state) or
  there may be a "hill" between the initial guess and the steady state (e.g. reactants
  must ignite and burn before reaching a steady state, but the steady-state residual will
  be larger during combustion).

Here are some of the ways to help debug lack of convergence of Newton:

- Run on one processor to see if the problem is only in parallel.

- Run with `-info` to get more detailed information on the solution process.

- Run with the options

  ```text
  -snes_monitor -ksp_monitor_true_residual -snes_converged_reason -ksp_converged_reason
  ```

  - If the linear solve does not converge, check if the Jacobian is correct, then see
    {ref}`this question <doc_faq_execution_kspconv>`.
  - If the preconditioned residual converges, but the true residual does not, the
    preconditioner may be singular.
  - If the linear solve converges well, but the line search fails, the Jacobian may be
    incorrect.

- Run with `-pc_type lu` or `-pc_type svd` to see if the problem is a poor linear
  solver.

- Run with `-mat_view` or `-mat_view draw` to see if the Jacobian looks reasonable.

- Run with `-snes_test_jacobian -snes_test_jacobian_view` to see if the Jacobian you are
  using is wrong. Compare the output when you add `-mat_fd_type ds` to see if the result
  is sensitive to the choice of differencing parameter.
14829b92b1d3SBarry Smith
14839b92b1d3SBarry Smith- Run with `-snes_mf_operator -pc_type lu` to see if the Jacobian you are using is
14849b92b1d3SBarry Smith  wrong. If the problem is too large for a direct solve, try
14859b92b1d3SBarry Smith
14869b92b1d3SBarry Smith  ```text
14879b92b1d3SBarry Smith  -snes_mf_operator -pc_type ksp -ksp_ksp_rtol 1e-12.
14889b92b1d3SBarry Smith  ```
14899b92b1d3SBarry Smith
14909b92b1d3SBarry Smith  Compare the output when you add `-mat_mffd_type ds` to see if the result is sensitive
14919b92b1d3SBarry Smith  to choice of differencing parameter.
14929b92b1d3SBarry Smith
14939b92b1d3SBarry Smith- Run with `-snes_linesearch_monitor` to see if the line search is failing (this is
14949b92b1d3SBarry Smith  usually a sign of a bad Jacobian). Use `-info` in PETSc 3.1 and older versions,
14959b92b1d3SBarry Smith  `-snes_ls_monitor` in PETSc 3.2 and `-snes_linesearch_monitor` in PETSc 3.3 and
14969b92b1d3SBarry Smith  later.
14979b92b1d3SBarry Smith
14989b92b1d3SBarry SmithHere are some ways to help the Newton process if everything above checks out:
14999b92b1d3SBarry Smith
15009b92b1d3SBarry Smith- Run with grid sequencing (`-snes_grid_sequence` if working with a `DM` is all you
15019b92b1d3SBarry Smith  need) to generate better initial guess on your finer mesh.
15029b92b1d3SBarry Smith
15039b92b1d3SBarry Smith- Run with quad precision, i.e.
15049b92b1d3SBarry Smith
15059b92b1d3SBarry Smith  ```console
15069b92b1d3SBarry Smith  $ ./configure --with-precision=__float128 --download-f2cblaslapack
15079b92b1d3SBarry Smith  ```
15089b92b1d3SBarry Smith
15099b92b1d3SBarry Smith  :::{note}
15109b92b1d3SBarry Smith  quad precision requires PETSc 3.2 and later and recent versions of the GNU compilers.
15119b92b1d3SBarry Smith  :::
15129b92b1d3SBarry Smith
15139b92b1d3SBarry Smith- Change the units (nondimensionalization), boundary condition scaling, or formulation so
15149b92b1d3SBarry Smith  that the Jacobian is better conditioned. See [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
15159b92b1d3SBarry Smith  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).
15169b92b1d3SBarry Smith
15179b92b1d3SBarry Smith- Mollify features in the function that do not have continuous first derivatives (often
15189b92b1d3SBarry Smith  occurs when there are "if" statements in the residual evaluation, e.g. phase change or
15199b92b1d3SBarry Smith  TVD limiters). Use a variational inequality solver (`SNESVINEWTONRSLS`) if the
15209b92b1d3SBarry Smith  discontinuities are of fundamental importance.
15219b92b1d3SBarry Smith
15229b92b1d3SBarry Smith- Try a trust region method (`-ts_type tr`, may have to adjust parameters).
15239b92b1d3SBarry Smith
15249b92b1d3SBarry Smith- Run with some continuation parameter from a point where you know the solution, see
15259b92b1d3SBarry Smith  `TSPSEUDO` for steady-states.
15269b92b1d3SBarry Smith
15279b92b1d3SBarry Smith- There are homotopy solver packages like PHCpack that can get you all possible solutions
15289b92b1d3SBarry Smith  (and tell you that it has found them all) but those are not scalable and cannot solve
15299b92b1d3SBarry Smith  anything but small problems.
15309b92b1d3SBarry Smith
15319b92b1d3SBarry Smith(doc_faq_execution_kspconv)=
15329b92b1d3SBarry Smith
15339b92b1d3SBarry Smith### Why is the linear solver (KSP) not converging, or converges slowly?
15349b92b1d3SBarry Smith
15359b92b1d3SBarry Smith:::{tip}
15369b92b1d3SBarry SmithAlways run with `-ksp_converged_reason -ksp_monitor_true_residual` when trying to
15379b92b1d3SBarry Smithlearn why a method is not converging!
15389b92b1d3SBarry Smith:::
15399b92b1d3SBarry Smith
15409b92b1d3SBarry SmithCommon reasons for KSP not converging are:
15419b92b1d3SBarry Smith
15429b92b1d3SBarry Smith- A symmetric method is being used for a non-symmetric problem.
15439b92b1d3SBarry Smith
15449b92b1d3SBarry Smith- The equations are singular by accident (e.g. forgot to impose boundary
15459b92b1d3SBarry Smith  conditions). Check this for a small problem using `-pc_type svd -pc_svd_monitor`.
15469b92b1d3SBarry Smith
15479b92b1d3SBarry Smith- The equations are intentionally singular (e.g. constant null space), but the Krylov
15489b92b1d3SBarry Smith  method was not informed, see `MatSetNullSpace()`. Always inform your local Krylov
15499b92b1d3SBarry Smith  subspace solver of any change of singularity. Failure to do so will result in the
15509b92b1d3SBarry Smith  immediate revocation of your computing and keyboard operator licenses, as well as
15519b92b1d3SBarry Smith  a stern talking-to by the nearest Krylov Subspace Method representative.
15529b92b1d3SBarry Smith
15539b92b1d3SBarry Smith- The equations are intentionally singular and `MatSetNullSpace()` was used, but the
15549b92b1d3SBarry Smith  right-hand side is not consistent. You may have to call `MatNullSpaceRemove()` on the
15559b92b1d3SBarry Smith  right-hand side before calling `KSPSolve()`. See `MatSetTransposeNullSpace()`.
15569b92b1d3SBarry Smith
15579b92b1d3SBarry Smith- The equations are indefinite so that standard preconditioners don't work. Usually you
15589b92b1d3SBarry Smith  will know this from the physics, but you can check with
15599b92b1d3SBarry Smith
15609b92b1d3SBarry Smith  ```text
15619b92b1d3SBarry Smith  -ksp_compute_eigenvalues -ksp_gmres_restart 1000 -pc_type none
15629b92b1d3SBarry Smith  ```
15639b92b1d3SBarry Smith
15649b92b1d3SBarry Smith  For simple saddle point problems, try
15659b92b1d3SBarry Smith
15669b92b1d3SBarry Smith  ```text
15679b92b1d3SBarry Smith  -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_detect_saddle_point
15689b92b1d3SBarry Smith  ```
15699b92b1d3SBarry Smith
15709b92b1d3SBarry Smith  For more difficult problems, read the literature to find robust methods and ask
15719b92b1d3SBarry Smith  <mailto:petsc-users@mcs.anl.gov> or <mailto:petsc-maint@mcs.anl.gov> if you want advice about how to
15729b92b1d3SBarry Smith  implement them.
15739b92b1d3SBarry Smith
15749b92b1d3SBarry Smith- If the method converges in preconditioned residual, but not in true residual, the
15759b92b1d3SBarry Smith  preconditioner is likely singular or nearly so. This is common for saddle point problems
15769b92b1d3SBarry Smith  (e.g. incompressible flow) or strongly nonsymmetric operators (e.g. low-Mach hyperbolic
15779b92b1d3SBarry Smith  problems with large time steps).
15789b92b1d3SBarry Smith
15799b92b1d3SBarry Smith- The preconditioner is too weak or is unstable. See if `-pc_type asm -sub_pc_type lu`
15809b92b1d3SBarry Smith  improves the convergence rate. If GMRES is losing too much progress in the restart, see
15819b92b1d3SBarry Smith  if longer restarts help `-ksp_gmres_restart 300`. If a transpose is available, try
15829b92b1d3SBarry Smith  `-ksp_type bcgs` or other methods that do not require a restart.
15839b92b1d3SBarry Smith
15849b92b1d3SBarry Smith  :::{note}
15859b92b1d3SBarry Smith  Unfortunately convergence with these methods is frequently erratic.
15869b92b1d3SBarry Smith  :::
15879b92b1d3SBarry Smith
15889b92b1d3SBarry Smith- The preconditioner is nonlinear (e.g. a nested iterative solve), try `-ksp_type
15899b92b1d3SBarry Smith  fgmres` or `-ksp_type gcr`.
15909b92b1d3SBarry Smith
15919b92b1d3SBarry Smith- You are using geometric multigrid, but some equations (often boundary conditions) are
15929b92b1d3SBarry Smith  not scaled compatibly between levels. Try `-pc_mg_galerkin` both to algebraically
15939b92b1d3SBarry Smith  construct a correctly scaled coarse operator or make sure that all the equations are
15949b92b1d3SBarry Smith  scaled in the same way if you want to use rediscretized coarse levels.
15959b92b1d3SBarry Smith
15969b92b1d3SBarry Smith- The matrix is very ill-conditioned. Check the {ref}`condition number <doc_faq_usage_condnum>`.
15979b92b1d3SBarry Smith
15989b92b1d3SBarry Smith  - Try to improve it by choosing the relative scaling of components/boundary conditions.
15999b92b1d3SBarry Smith  - Try `-ksp_diagonal_scale -ksp_diagonal_scale_fix`.
16009b92b1d3SBarry Smith  - Perhaps change the formulation of the problem to produce more friendly algebraic
16019b92b1d3SBarry Smith    equations.
16029b92b1d3SBarry Smith
16039b92b1d3SBarry Smith- Change the units (nondimensionalization), boundary condition scaling, or formulation so
16049b92b1d3SBarry Smith  that the Jacobian is better conditioned. See [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
16059b92b1d3SBarry Smith  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).
16069b92b1d3SBarry Smith
16079b92b1d3SBarry Smith- Classical Gram-Schmidt is becoming unstable, try `-ksp_gmres_modifiedgramschmidt` or
16089b92b1d3SBarry Smith  use a method that orthogonalizes differently, e.g. `-ksp_type gcr`.
16099b92b1d3SBarry Smith
16109b92b1d3SBarry Smith### I get the error message: Actual argument at (1) to assumed-type dummy is of derived type with type-bound or FINAL procedures
16119b92b1d3SBarry Smith
16129b92b1d3SBarry SmithUse the following code-snippet:
16139b92b1d3SBarry Smith
16149b92b1d3SBarry Smith```fortran
16159b92b1d3SBarry Smithmodule context_module
16169b92b1d3SBarry Smith#include petsc/finclude/petsc.h
16179b92b1d3SBarry Smithuse petsc
16189b92b1d3SBarry Smithimplicit none
16199b92b1d3SBarry Smithprivate
16209b92b1d3SBarry Smithtype, public ::  context_type
16219b92b1d3SBarry Smith  private
16229b92b1d3SBarry Smith  PetscInt :: foo
16239b92b1d3SBarry Smithcontains
16249b92b1d3SBarry Smith  procedure, public :: init => context_init
16259b92b1d3SBarry Smithend type context_type
16269b92b1d3SBarry Smithcontains
16279b92b1d3SBarry Smithsubroutine context_init(self, foo)
16289b92b1d3SBarry Smith  class(context_type), intent(in out) :: self
16299b92b1d3SBarry Smith  PetscInt, intent(in) :: foo
16309b92b1d3SBarry Smith  self%foo = foo
16319b92b1d3SBarry Smithend subroutine context_init
16329b92b1d3SBarry Smithend module context_module
16339b92b1d3SBarry Smith
16349b92b1d3SBarry Smith!------------------------------------------------------------------------
16359b92b1d3SBarry Smith
16369b92b1d3SBarry Smithprogram test_snes
16379b92b1d3SBarry Smithuse,intrinsic :: iso_c_binding
16389b92b1d3SBarry Smithuse petsc
16399b92b1d3SBarry Smithuse context_module
16409b92b1d3SBarry Smithimplicit none
16419b92b1d3SBarry Smith
16429b92b1d3SBarry SmithSNES :: snes
16439b92b1d3SBarry Smithtype(context_type),target :: context
16449b92b1d3SBarry Smithtype(c_ptr) :: contextOut
16459b92b1d3SBarry SmithPetscErrorCode :: ierr
16469b92b1d3SBarry Smith
16479b92b1d3SBarry Smithcall PetscInitialize(PETSC_NULL_CHARACTER, ierr)
16489b92b1d3SBarry Smithcall SNESCreate(PETSC_COMM_WORLD, snes, ierr)
16499b92b1d3SBarry Smithcall context%init(1)
16509b92b1d3SBarry Smith
16519b92b1d3SBarry SmithcontextOut = c_loc(context) ! contextOut is a C pointer on context
16529b92b1d3SBarry Smith
16539b92b1d3SBarry Smithcall SNESSetConvergenceTest(snes, convergence, contextOut, PETSC_NULL_FUNCTION, ierr)
16549b92b1d3SBarry Smith
16559b92b1d3SBarry Smithcall SNESDestroy(snes, ierr)
16569b92b1d3SBarry Smithcall PetscFinalize(ierr)
16579b92b1d3SBarry Smith
16589b92b1d3SBarry Smithcontains
16599b92b1d3SBarry Smith
16609b92b1d3SBarry Smithsubroutine convergence(snes, num_iterations, xnorm, pnorm,fnorm, reason, contextIn, ierr)
16619b92b1d3SBarry SmithSNES, intent(in) :: snes
16629b92b1d3SBarry Smith
16639b92b1d3SBarry SmithPetscInt, intent(in) :: num_iterations
16649b92b1d3SBarry SmithPetscReal, intent(in) :: xnorm, pnorm, fnorm
16659b92b1d3SBarry SmithSNESConvergedReason, intent(out) :: reason
16669b92b1d3SBarry Smithtype(c_ptr), intent(in out) :: contextIn
16679b92b1d3SBarry Smithtype(context_type), pointer :: context
16689b92b1d3SBarry SmithPetscErrorCode, intent(out) :: ierr
16699b92b1d3SBarry Smith
16709b92b1d3SBarry Smithcall c_f_pointer(contextIn,context)  ! convert the C pointer to a Fortran pointer to use context as in the main program
16719b92b1d3SBarry Smithreason = 0
16729b92b1d3SBarry Smithierr = 0
16739b92b1d3SBarry Smithend subroutine convergence
16749b92b1d3SBarry Smithend program test_snes
16759b92b1d3SBarry Smith```
16769b92b1d3SBarry Smith
16779b92b1d3SBarry Smith### In C++ I get a crash on `VecDestroy()` (or some other PETSc object) at the end of the program
16789b92b1d3SBarry Smith
16799b92b1d3SBarry SmithThis can happen when the destructor for a C++ class is automatically called at the end of
16809b92b1d3SBarry Smiththe program after `PetscFinalize()`. Use the following code-snippet:
16819b92b1d3SBarry Smith
16829b92b1d3SBarry Smith```
16839b92b1d3SBarry Smithmain()
16849b92b1d3SBarry Smith{
16859b92b1d3SBarry Smith  PetscCall(PetscInitialize());
16869b92b1d3SBarry Smith  {
16879b92b1d3SBarry Smith    your variables
16889b92b1d3SBarry Smith    your code
16899b92b1d3SBarry Smith
16909b92b1d3SBarry Smith    ...   /* all your destructors are called here automatically by C++ so they work correctly */
16919b92b1d3SBarry Smith  }
16929b92b1d3SBarry Smith  PetscCall(PetscFinalize());
16939b92b1d3SBarry Smith  return 0
16949b92b1d3SBarry Smith}
16959b92b1d3SBarry Smith```
16969b92b1d3SBarry Smith
16979b92b1d3SBarry Smith______________________________________________________________________
16989b92b1d3SBarry Smith
16999b92b1d3SBarry Smith## Debugging
17009b92b1d3SBarry Smith
17019b92b1d3SBarry Smith### What does the message hwloc/linux: Ignoring PCI device with non-16bit domain mean?
17029b92b1d3SBarry Smith
17039b92b1d3SBarry SmithThis is printed by the hwloc library that is used by some MPI implementations. It can be ignored.
17049b92b1d3SBarry SmithTo prevent the message from always being printed set the environmental variable `HWLOC_HIDE_ERRORS` to 2.
17059b92b1d3SBarry SmithFor example
17069b92b1d3SBarry Smith
17079b92b1d3SBarry Smith```
17089b92b1d3SBarry Smithexport HWLOC_HIDE_ERRORS=2
17099b92b1d3SBarry Smith```
17109b92b1d3SBarry Smith
17119b92b1d3SBarry Smithwhich can be added to your login profile file such as `~/.bashrc`.
17129b92b1d3SBarry Smith
17139b92b1d3SBarry Smith### How do I turn off PETSc signal handling so I can use the `-C` Option On `xlf`?
17149b92b1d3SBarry Smith
17159b92b1d3SBarry SmithImmediately after calling `PetscInitialize()` call `PetscPopSignalHandler()`.
17169b92b1d3SBarry Smith
17179b92b1d3SBarry SmithSome Fortran compilers including the IBM xlf, xlF etc compilers have a compile option
17189b92b1d3SBarry Smith(`-C` for IBM's) that causes all array access in Fortran to be checked that they are
17199b92b1d3SBarry Smithin-bounds. This is a great feature but does require that the array dimensions be set
17209b92b1d3SBarry Smithexplicitly, not with a \*.
17219b92b1d3SBarry Smith
17229b92b1d3SBarry Smith### How do I debug if `-start_in_debugger` does not work on my machine?
17239b92b1d3SBarry Smith
17249b92b1d3SBarry SmithThe script <https://github.com/Azrael3000/tmpi> can be used to run multiple MPI
17259b92b1d3SBarry Smithranks in the debugger using tmux.
17269b92b1d3SBarry Smith
17279b92b1d3SBarry SmithOn newer macOS machines - one has to be in admin group to be able to use debugger.
17289b92b1d3SBarry Smith
17299b92b1d3SBarry SmithOn newer Ubuntu linux machines - one has to disable `ptrace_scope` with
17309b92b1d3SBarry Smith
17319b92b1d3SBarry Smith```console
17329b92b1d3SBarry Smith$ sudo echo 0 > /proc/sys/kernel/yama/ptrace_scope
17339b92b1d3SBarry Smith```
17349b92b1d3SBarry Smith
17359b92b1d3SBarry Smithto get start in debugger working.
17369b92b1d3SBarry Smith
17379b92b1d3SBarry SmithIf `-start_in_debugger` does not work on your OS, for a uniprocessor job, just
17389b92b1d3SBarry Smithtry the debugger directly, for example: `gdb ex1`. You can also use [TotalView](https://totalview.io/products/totalview) which is a good graphical parallel debugger.
17399b92b1d3SBarry Smith
17409b92b1d3SBarry Smith### How do I see where my code is hanging?
17419b92b1d3SBarry Smith
17429b92b1d3SBarry SmithYou can use the `-start_in_debugger` option to start all processes in the debugger (each
17439b92b1d3SBarry Smithwill come up in its own xterm) or run in [TotalView](https://totalview.io/products/totalview). Then use `cont` (for continue) in each
17449b92b1d3SBarry Smithxterm. Once you are sure that the program is hanging, hit control-c in each xterm and then
17459b92b1d3SBarry Smithuse 'where' to print a stack trace for each process.
17469b92b1d3SBarry Smith
17479b92b1d3SBarry Smith### How can I inspect PETSc vector and matrix values when in the debugger?
17489b92b1d3SBarry Smith
17499b92b1d3SBarry SmithI will illustrate this with `gdb`, but it should be similar on other debuggers. You can
17509b92b1d3SBarry Smithlook at local `Vec` values directly by obtaining the array. For a `Vec` v, we can
17519b92b1d3SBarry Smithprint all local values using:
17529b92b1d3SBarry Smith
17539b92b1d3SBarry Smith```console
17549b92b1d3SBarry Smith(gdb) p ((Vec_Seq*) v->data)->array[0]@v->map.n
17559b92b1d3SBarry Smith```
17569b92b1d3SBarry Smith
17579b92b1d3SBarry SmithHowever, this becomes much more complicated for a matrix. Therefore, it is advisable to use the default viewer to look at the object. For a `Vec` v and a `Mat` m, this would be:
17589b92b1d3SBarry Smith
17599b92b1d3SBarry Smith```console
17609b92b1d3SBarry Smith(gdb) call VecView(v, 0)
17619b92b1d3SBarry Smith(gdb) call MatView(m, 0)
17629b92b1d3SBarry Smith```
17639b92b1d3SBarry Smith
17649b92b1d3SBarry Smithor with a communicator other than `MPI_COMM_WORLD`:
17659b92b1d3SBarry Smith
17669b92b1d3SBarry Smith```console
17679b92b1d3SBarry Smith(gdb) call MatView(m, PETSC_VIEWER_STDOUT_(m->comm))
17689b92b1d3SBarry Smith```
17699b92b1d3SBarry Smith
17709b92b1d3SBarry SmithTotalview 8.8.0+ has a new feature that allows libraries to provide their own code to
17719b92b1d3SBarry Smithdisplay objects in the debugger. Thus in theory each PETSc object, `Vec`, `Mat` etc
17729b92b1d3SBarry Smithcould have custom code to print values in the object. We have only done this for the most
17739b92b1d3SBarry Smithelementary display of `Vec` and `Mat`. See the routine `TV_display_type()` in
17749b92b1d3SBarry Smith`src/vec/vec/interface/vector.c` for an example of how these may be written. Contact us
17759b92b1d3SBarry Smithif you would like to add more.
17769b92b1d3SBarry Smith
17779b92b1d3SBarry Smith### How can I find the cause of floating point exceptions like not-a-number (NaN) or infinity?
17789b92b1d3SBarry Smith
17799b92b1d3SBarry SmithThe best way to locate floating point exceptions is to use a debugger. On supported
17809b92b1d3SBarry Smitharchitectures (including Linux and glibc-based systems), just run in a debugger and pass
17819b92b1d3SBarry Smith`-fp_trap` to the PETSc application. This will activate signaling exceptions and the
17829b92b1d3SBarry Smithdebugger will break on the line that first divides by zero or otherwise generates an
17839b92b1d3SBarry Smithexceptions.
17849b92b1d3SBarry Smith
17859b92b1d3SBarry SmithWithout a debugger, running with `-fp_trap` in debug mode will only identify the
17869b92b1d3SBarry Smithfunction in which the error occurred, but not the line or the type of exception. If
17879b92b1d3SBarry Smith`-fp_trap` is not supported on your architecture, consult the documentation for your
17889b92b1d3SBarry Smithdebugger since there is likely a way to have it catch exceptions.
17899b92b1d3SBarry Smith
17909b92b1d3SBarry Smith(error_libimf)=
17919b92b1d3SBarry Smith
17929b92b1d3SBarry Smith### Error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory
17939b92b1d3SBarry Smith
17949b92b1d3SBarry SmithThe Intel compilers use shared libraries (like libimf) that cannot be found, by default, at run
17959b92b1d3SBarry Smithtime. When using the Intel compilers (and running the resulting code) you must make sure
17969b92b1d3SBarry Smiththat the proper Intel initialization scripts are run. This is usually done by adding some
17979b92b1d3SBarry Smithcode into your `.cshrc`, `.bashrc`, `.profile` etc file. Sometimes on batch file
17989b92b1d3SBarry Smithsystems that do now access your initialization files (like .cshrc) you must include the
17999b92b1d3SBarry Smithinitialization calls in your batch file submission.
18009b92b1d3SBarry Smith
18019b92b1d3SBarry SmithFor example, on my Mac using `csh` I have the following in my `.cshrc` file:
18029b92b1d3SBarry Smith
18039b92b1d3SBarry Smith```csh
18049b92b1d3SBarry Smithsource /opt/intel/cc/10.1.012/bin/iccvars.csh
18059b92b1d3SBarry Smithsource /opt/intel/fc/10.1.012/bin/ifortvars.csh
18069b92b1d3SBarry Smithsource /opt/intel/idb/10.1.012/bin/idbvars.csh
18079b92b1d3SBarry Smith```
18089b92b1d3SBarry Smith
18099b92b1d3SBarry SmithAnd in my `.profile` I have
18109b92b1d3SBarry Smith
18119b92b1d3SBarry Smith```csh
18129b92b1d3SBarry Smithsource /opt/intel/cc/10.1.012/bin/iccvars.sh
18139b92b1d3SBarry Smithsource /opt/intel/fc/10.1.012/bin/ifortvars.sh
18149b92b1d3SBarry Smithsource /opt/intel/idb/10.1.012/bin/idbvars.sh
18159b92b1d3SBarry Smith```
18169b92b1d3SBarry Smith
18179b92b1d3SBarry Smith(object_type_not_set)=
18189b92b1d3SBarry Smith
18199b92b1d3SBarry Smith### What does "Object Type Not Set: Argument # N" Mean?
18209b92b1d3SBarry Smith
18219b92b1d3SBarry SmithMany operations on PETSc objects require that the specific type of the object be set before the operations is performed. You must call `XXXSetType()` or `XXXSetFromOptions()` before you make the offending call. For example
18229b92b1d3SBarry Smith
18239b92b1d3SBarry Smith```
18249b92b1d3SBarry SmithMat A;
18259b92b1d3SBarry Smith
18269b92b1d3SBarry SmithPetscCall(MatCreate(PETSC_COMM_WORLD, &A));
18279b92b1d3SBarry SmithPetscCall(MatSetValues(A,....));
18289b92b1d3SBarry Smith```
18299b92b1d3SBarry Smith
18309b92b1d3SBarry Smithwill not work. You must add `MatSetType()` or `MatSetFromOptions()` before the call to `MatSetValues()`. I.e.
18319b92b1d3SBarry Smith
18329b92b1d3SBarry Smith```
18339b92b1d3SBarry SmithMat A;
18349b92b1d3SBarry Smith
18359b92b1d3SBarry SmithPetscCall(MatCreate(PETSC_COMM_WORLD, &A));
18369b92b1d3SBarry Smith
18379b92b1d3SBarry SmithPetscCall(MatSetType(A, MATAIJ));
18389b92b1d3SBarry Smith/* Will override MatSetType() */
18399b92b1d3SBarry SmithPetscCall(MatSetFromOptions());
18409b92b1d3SBarry Smith
18419b92b1d3SBarry SmithPetscCall(MatSetValues(A,....));
18429b92b1d3SBarry Smith```
18439b92b1d3SBarry Smith
18449b92b1d3SBarry Smith(split_ownership)=
18459b92b1d3SBarry Smith
18469b92b1d3SBarry Smith### What does error detected in `PetscSplitOwnership()` about "sum of local lengths ...": mean?
18479b92b1d3SBarry Smith
18489b92b1d3SBarry SmithIn a previous call to `VecSetSizes()`, `MatSetSizes()`, `VecCreateXXX()` or
18499b92b1d3SBarry Smith`MatCreateXXX()` you passed in local and global sizes that do not make sense for the
18509b92b1d3SBarry Smithcorrect number of processors. For example if you pass in a local size of 2 and a global
18519b92b1d3SBarry Smithsize of 100 and run on two processors, this cannot work since the sum of the local sizes
18529b92b1d3SBarry Smithis 4, not 100.
18539b92b1d3SBarry Smith
18549b92b1d3SBarry Smith(valgrind)=
18559b92b1d3SBarry Smith
1856*74df5e01SJunchao Zhang### What does "corrupt argument" or "caught signal" Or "SEGV" Or "segmentation violation" Or "bus error" mean? Can I use Valgrind or CUDA Compute Sanitizer to debug memory corruption issues?
18579b92b1d3SBarry Smith
18589b92b1d3SBarry SmithSometimes it can mean an argument to a function is invalid. In Fortran this may be caused
18599b92b1d3SBarry Smithby forgetting to list an argument in the call, especially the final `PetscErrorCode`.
18609b92b1d3SBarry Smith
18619b92b1d3SBarry SmithOtherwise it is usually caused by memory corruption; that is somewhere the code is writing
18629b92b1d3SBarry Smithout of array bounds. To track this down rerun the debug version of the code with the
18639b92b1d3SBarry Smithoption `-malloc_debug`. Occasionally the code may crash only with the optimized version,
18649b92b1d3SBarry Smithin that case run the optimized version with `-malloc_debug`. If you determine the
18659b92b1d3SBarry Smithproblem is from memory corruption you can put the macro CHKMEMQ in the code near the crash
18669b92b1d3SBarry Smithto determine exactly what line is causing the problem.
18679b92b1d3SBarry Smith
1868*74df5e01SJunchao ZhangIf `-malloc_debug` does not help: on NVIDIA CUDA systems you can use <https://docs.nvidia.com/compute-sanitizer/ComputeSanitizer/index.html>,
1869*74df5e01SJunchao Zhangfor example, `compute-sanitizer --tool memcheck [sanitizer_options] app_name [app_options]`.
18709b92b1d3SBarry Smith
18719b92b1d3SBarry SmithIf `-malloc_debug` does not help: on GNU/Linux (not macOS machines) - you can
18729b92b1d3SBarry Smithuse [valgrind](http://valgrind.org). Follow the below instructions:
18739b92b1d3SBarry Smith
18749b92b1d3SBarry Smith1. `configure` PETSc with `--download-mpich --with-debugging` (you can use other MPI implementations but most produce spurious Valgrind messages)
18759b92b1d3SBarry Smith
18769b92b1d3SBarry Smith2. Compile your application code with this build of PETSc.
18779b92b1d3SBarry Smith
18789b92b1d3SBarry Smith3. Run with Valgrind.
18799b92b1d3SBarry Smith
18809b92b1d3SBarry Smith   ```console
18819b92b1d3SBarry Smith   $ $PETSC_DIR/lib/petsc/bin/petscmpiexec -valgrind -n NPROC PETSCPROGRAMNAME PROGRAMOPTIONS
18829b92b1d3SBarry Smith   ```
18839b92b1d3SBarry Smith
18849b92b1d3SBarry Smith   or
18859b92b1d3SBarry Smith
18869b92b1d3SBarry Smith   ```console
18879b92b1d3SBarry Smith   $ mpiexec -n NPROC valgrind --tool=memcheck -q --num-callers=20 \
18889b92b1d3SBarry Smith   --suppressions=$PETSC_DIR/share/petsc/suppressions/valgrind \
18899b92b1d3SBarry Smith   --log-file=valgrind.log.%p PETSCPROGRAMNAME -malloc off PROGRAMOPTIONS
18909b92b1d3SBarry Smith   ```
18919b92b1d3SBarry Smith
18929b92b1d3SBarry Smith:::{note}
18939b92b1d3SBarry Smith- option `--with-debugging` enables valgrind to give stack trace with additional
18949b92b1d3SBarry Smith  source-file:line-number info.
18959b92b1d3SBarry Smith- option `--download-mpich` is valgrind clean, other MPI builds are not valgrind clean.
18969b92b1d3SBarry Smith- when `--download-mpich` is used - mpiexec will be in `$PETSC_ARCH/bin`
18979b92b1d3SBarry Smith- `--log-file=valgrind.log.%p` option tells valgrind to store the output from each
18989b92b1d3SBarry Smith  process in a different file \[as %p i.e PID, is different for each MPI process.
18999b92b1d3SBarry Smith- `memcheck` will not find certain array access that violate static array
19009b92b1d3SBarry Smith  declarations so if memcheck runs clean you can try the `--tool=exp-ptrcheck`
19019b92b1d3SBarry Smith  instead.
19029b92b1d3SBarry Smith:::
19039b92b1d3SBarry Smith
19049b92b1d3SBarry SmithYou might also consider using <http://drmemory.org> which has support for GNU/Linux, Apple
19059b92b1d3SBarry SmithMac OS and Microsoft Windows machines. (Note we haven't tried this ourselves).
19069b92b1d3SBarry Smith
19079b92b1d3SBarry Smith(zeropivot)=
19089b92b1d3SBarry Smith
19099b92b1d3SBarry Smith### What does "detected zero pivot in LU factorization" mean?
19109b92b1d3SBarry Smith
19119b92b1d3SBarry SmithA zero pivot in LU, ILU, Cholesky, or ICC sparse factorization does not always mean that
19129b92b1d3SBarry Smiththe matrix is singular. You can use
19139b92b1d3SBarry Smith
19149b92b1d3SBarry Smith```text
19159b92b1d3SBarry Smith-pc_factor_shift_type nonzero -pc_factor_shift_amount [amount]
19169b92b1d3SBarry Smith```
19179b92b1d3SBarry Smith
19189b92b1d3SBarry Smithor
19199b92b1d3SBarry Smith
19209b92b1d3SBarry Smith```text
19219b92b1d3SBarry Smith-pc_factor_shift_type positive_definite -[level]_pc_factor_shift_type nonzero
19229b92b1d3SBarry Smith -pc_factor_shift_amount [amount]
19239b92b1d3SBarry Smith```
19249b92b1d3SBarry Smith
19259b92b1d3SBarry Smithor
19269b92b1d3SBarry Smith
19279b92b1d3SBarry Smith```text
19289b92b1d3SBarry Smith-[level]_pc_factor_shift_type positive_definite
19299b92b1d3SBarry Smith```
19309b92b1d3SBarry Smith
19319b92b1d3SBarry Smithto prevent the zero pivot. [level] is "sub" when lu, ilu, cholesky, or icc are employed in
19329b92b1d3SBarry Smitheach individual block of the bjacobi or ASM preconditioner. [level] is "mg_levels" or
19339b92b1d3SBarry Smith"mg_coarse" when lu, ilu, cholesky, or icc are used inside multigrid smoothers or to the
19349b92b1d3SBarry Smithcoarse grid solver. See `PCFactorSetShiftType()`, `PCFactorSetShiftAmount()`.
19359b92b1d3SBarry Smith
19369b92b1d3SBarry SmithThis error can also happen if your matrix is singular, see `MatSetNullSpace()` for how
19379b92b1d3SBarry Smithto handle this. If this error occurs in the zeroth row of the matrix, it is likely you
19389b92b1d3SBarry Smithhave an error in the code that generates the matrix.
19399b92b1d3SBarry Smith
19409b92b1d3SBarry Smith### You create draw windows or `PETSCVIEWERDRAW` windows or use options `-ksp_monitor draw::draw_lg` or `-snes_monitor draw::draw_lg` and the program seems to run OK but windows never open
19419b92b1d3SBarry Smith
19429b92b1d3SBarry SmithThe libraries were compiled without support for X windows. Make sure that `configure`
19439b92b1d3SBarry Smithwas run with the option `--with-x`.
19449b92b1d3SBarry Smith
### The program seems to use more and more memory as it runs, even though you don't think you are allocating more memory

Some of the following may be occurring:

- You are creating new PETSc objects but never freeing them.
- There is a memory leak in PETSc or your code.
- Something much more subtle (if you are using Fortran): when you declare a large array
  in Fortran, the operating system does not allocate all the memory pages for that array
  until you start using the different locations in the array. Thus, if at each step your
  code starts using later values in the array, your virtual memory usage will "continue"
  to increase as measured by `ps` or `top`.
- You are running with the `-log`, `-log_mpe`, or `-log_all` option. With these
  options, a great deal of logging information is stored in memory until the conclusion of
  the run.
- You are linking with the MPI profiling libraries; these cause logging of all MPI
  activities. Another symptom is that at the conclusion of the run it may print some
  message about writing log files.

The following may help:

- Run with the `-malloc_debug` and `-malloc_view` options. Or use `PetscMallocDump()`
  and `PetscMallocView()` sprinkled about your code to track memory that is allocated
  and not later freed. Use `PetscMallocGetCurrentUsage()` and
  `PetscMemoryGetCurrentUsage()` to monitor memory allocated, and
  `PetscMallocGetMaximumUsage()` and `PetscMemoryGetMaximumUsage()` for the total memory
  used as the code progresses.
- This is just the way Unix works and is harmless.
- Do not use the `-log`, `-log_mpe`, or `-log_all` options, or use
  `PetscLogEventDeactivate()` or `PetscLogEventDeactivateClass()` to turn off logging of
  specific events.
- Make sure you do not link with the MPI profiling libraries.
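For example, to report any PETSc allocations that were never freed at the end of a run
(the executable name is a placeholder):

```console
$ ./ex1 -malloc_debug -malloc_view
```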
19769b92b1d3SBarry Smith
### When calling `MatPartitioningApply()` you get a message "Error! Key 16615 Not Found"

The adjacency graph of the matrix you are using is not symmetric. The partitioners
require matrices with a symmetric nonzero structure.
19819b92b1d3SBarry Smith
### With GMRES at restart the second residual norm printed does not match the first

I.e.

```text
26 KSP Residual norm 3.421544615851e-04
27 KSP Residual norm 2.973675659493e-04
28 KSP Residual norm 2.588642948270e-04
29 KSP Residual norm 2.268190747349e-04
30 KSP Residual norm 1.977245964368e-04
30 KSP Residual norm 1.994426291979e-04 <----- At restart the residual norm is printed a second time
```

This is actually not surprising! At each iteration GMRES computes the norm of the
residual via a recurrence relation involving the norms of the residuals at the previous
iterations and quantities computed at the current iteration. It does not compute it
directly as $|| b - A x^{n} ||$.

Sometimes, especially with an ill-conditioned matrix or when the matrix-vector product is
computed via differencing, the residual norms computed by GMRES start to "drift" from the
correct values. At the restart the residual norm is computed directly, hence the "strange
stuff": the difference printed. The drift, if it remains small, is harmless (it does not
affect the accuracy of the solution that GMRES computes).

If you use a more powerful preconditioner the drift will often be smaller and less
noticeable. Or, if you are running matrix-free, you may need to tune the matrix-free
parameters.
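To observe the drift directly, you can print both the (preconditioned) recurrence-based
norm and the true residual norm $|| b - A x^{n} ||$ at every iteration (the executable
name is a placeholder):

```console
$ ./ex1 -ksp_monitor_true_residual
```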
20099b92b1d3SBarry Smith
### Why do some Krylov methods seem to print two residual norms per iteration?

I.e.

```text
1198 KSP Residual norm 1.366052062216e-04
1198 KSP Residual norm 1.931875025549e-04
1199 KSP Residual norm 1.366026406067e-04
1199 KSP Residual norm 1.931819426344e-04
```

Some Krylov methods, for example `KSPTFQMR`, actually have a "sub-iteration" of size 2
inside the loop. Each of the two substeps has its own matrix-vector product and
application of the preconditioner, and each updates the residual approximations. This is
why you get this "funny" output where it looks like there are two residual norms per
iteration. You can also think of it as twice as many iterations.
20269b92b1d3SBarry Smith
### Unable to locate PETSc dynamic library `libpetsc`

When using DYNAMIC libraries, the libraries cannot be moved after they are
installed. This can also happen on clusters where the paths on the (run) compute nodes
differ from those on the (compile) front-end node. In that case, **do not use dynamic and
shared libraries**: run `configure` with
`--with-shared-libraries=0 --with-dynamic-loading=0`.

:::{important}
The `--with-dynamic-loading` option was removed in PETSc v3.5
:::
20389b92b1d3SBarry Smith
### How do I determine what update to PETSc broke my code?

If at some point (in the PETSc code history) you had a working code, but the latest PETSc
code broke it, it is possible to determine the PETSc code change that caused this
behavior. This is achieved by:

- Using Git to access the PETSc sources
- Knowing the Git commit for the known working version of PETSc
- Knowing the Git commit for the known broken version of PETSc
- Using the [bisect](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html)
  functionality of Git

Git bisect can be done as follows:

1. Get the PETSc development (`main` branch in Git) sources

   ```console
   $ git clone https://gitlab.com/petsc/petsc.git
   ```

2. Find the good and bad markers to start the bisection process. This can be done either
   by checking `git log` or `gitk` or <https://gitlab.com/petsc/petsc> or the web
   history of petsc-release clones. Let's say the known bad commit is 21af4baa815c and
   the known good commit is 5ae5ab319844.

3. Start the bisection process with these known revisions. Build PETSc, and test your code
   to confirm known good/bad behavior:

   ```console
   $ git bisect start 21af4baa815c 5ae5ab319844
   ```

   Build/test; perhaps discover that this new state is bad

   ```console
   $ git bisect bad
   ```

   Build/test; perhaps discover that this state is good

   ```console
   $ git bisect good
   ```

   Keep bisecting, building PETSc, and testing your code with it, determining at each
   step whether the code works or not. After something like 5-15 iterations, `git
   bisect` will pinpoint the exact code change that resulted in the difference in
   application behavior.
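If the build-and-test cycle can be scripted to exit with status 0 on good and nonzero on
bad, Git can drive the whole bisection automatically. A sketch, assuming a hypothetical
script `test.sh` that rebuilds PETSc and runs your application:

```console
$ git bisect start 21af4baa815c 5ae5ab319844
$ git bisect run ./test.sh
```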
20879b92b1d3SBarry Smith
:::{tip}
See [git-bisect(1)](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html) and the
[debugging section of the Git Book](https://git-scm.com/book/en/Git-Tools-Debugging-with-Git) for more debugging tips.
:::
20929b92b1d3SBarry Smith
### How to fix the error "PMIX Error: error in file gds_ds12_lock_pthread.c"?

This seems to be an error when using Open MPI together with OpenBLAS built with threads
(or perhaps other packages that use threads). Setting

```console
$ export PMIX_MCA_gds=hash
```

before launching should resolve the problem.
21039b92b1d3SBarry Smith
______________________________________________________________________

(doc_faq_sharedlibs)=

## Shared Libraries

### Can I install PETSc libraries as shared libraries?

Yes. Use

```console
$ ./configure --with-shared-libraries
```
21179b92b1d3SBarry Smith
### Why should I use shared libraries?

When you link to shared libraries, the function symbols from the shared libraries are not
copied into the executable, so the executable is considerably smaller than when using
static libraries. This helps in a couple of ways:

- It saves disk space when more than one executable is created.
- It improves link time immensely, because the linker has to write a much smaller file
  (the executable) to disk.
21279b92b1d3SBarry Smith
### How do I link to the PETSc shared libraries?

By default, the compiler should pick up the shared libraries instead of the static
ones. Nothing special should be done for this.
21329b92b1d3SBarry Smith
### What if I want to link to the regular `.a` library files?

You must run `configure` with the option `--with-shared-libraries=0` (you can use a
different `$PETSC_ARCH` for this build so you can easily switch between the two).
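For example, the two builds can live side by side, each with its own `$PETSC_ARCH` (the
arch names here are arbitrary):

```console
$ ./configure PETSC_ARCH=arch-shared --with-shared-libraries=1
$ ./configure PETSC_ARCH=arch-static --with-shared-libraries=0
```

Selecting a build at compile time is then just a matter of setting `PETSC_ARCH`.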
21379b92b1d3SBarry Smith
### What do I do if I want to move my executable to a different machine?

You would also need access to the shared libraries on the new machine. The alternative is
to build the executable without shared libraries, by first deleting the shared libraries
and then creating the executable.
21439b92b1d3SBarry Smith
```{bibliography} /petsc.bib
:filter: docname in docnames
```