xref: /petsc/doc/faq/index.md (revision d000e8ac2eed99d0cf14796aab94de0aed57fd8b)
(doc_faq)=

# FAQ

```{contents} Table Of Contents
:backlinks: top
:local: true
```

______________________________________________________________________

## General

### How can I subscribe to the PETSc mailing lists?

See mailing list {ref}`documentation <doc_mail>`

### Any useful books on numerical computing?

[Bueler, PETSc for Partial Differential Equations: Numerical Solutions in C and Python](https://my.siam.org/Store/Product/viewproduct/?ProductId=32850137)

[Oliveira and Stewart, Writing Scientific Software: A Guide to Good Style](https://www.cambridge.org/core/books/writing-scientific-software/23206704175AF868E43FE3FB399C2F53)

(doc_faq_general_parallel)=

### What kind of parallel computers or clusters are needed to use PETSc? Or why do I get little speedup?

:::{important}
PETSc can be used with any kind of parallel system that supports MPI, BUT for any decent
performance one needs:

- A fast, **low-latency** interconnect; any Ethernet (even 10 GigE) simply cannot provide
  the needed performance.
- High per-core **memory** performance. Each core needs to
  have its **own** memory bandwidth of at least 2 gigabytes/second. Most modern
  computers are not bottlenecked by how fast they can perform
  calculations; rather, they are usually restricted by how quickly they can get their
  data.
:::

To obtain good performance it is important that you know your machine: how many
compute nodes (generally, how many motherboards), how many memory sockets per node, how
many cores per memory socket, and how much memory bandwidth each has.

If you do not know this and can run MPI programs with mpiexec (that is, you don't have a
batch system), run the following from `$PETSC_DIR`:

```console
$ make streams [NPMAX=maximum_number_of_mpi_processes_you_plan_to_use]
```

This will provide a summary of the memory bandwidth achieved with different numbers of MPI
processes and the potential speedups. See {any}`ch_streams` for a detailed discussion.

If you have a batch system:

```console
$ cd $PETSC_DIR/src/benchmarks/streams
$ make MPIVersion
```

Submit `MPIVersion` to the batch system a number of times with 1, 2, 3, etc. MPI
processes, collecting all of the output from the runs into the single file
`scaling.log`. Copy `scaling.log` into the `src/benchmarks/streams` directory, then run:

```console
$ ./process.py createfile ; ./process.py
```

Even if you have enough memory bandwidth, performance can degrade if the OS switches
processes between cores. Smart process to core/socket binding (this just means locking a
process to a particular core or memory socket) may help you. For example, consider using
fewer processes than cores and binding processes to separate sockets so that each process
uses a different memory bus:

- [MPICH2 binding with the Hydra process manager](https://github.com/pmodels/mpich/blob/main/doc/wiki/how_to/Using_the_Hydra_Process_Manager.md#process-core-binding)

  ```console
  $ mpiexec.hydra -n 4 --binding cpu:sockets
  ```

- [Open MPI binding](https://www.open-mpi.org/faq/?category=tuning#using-paffinity)

  ```console
  $ mpiexec -n 4 --map-by socket --bind-to socket --report-bindings
  ```

- `taskset`, part of the [util-linux](https://github.com/karelzak/util-linux) package

  Check `man taskset` for details. Make sure to set affinity for **your** program,
  **not** for the `mpiexec` program.

- `numactl`

  In addition to task affinity, this tool also allows changing the default memory affinity
  policy. On Linux, the default policy is to attempt to find memory on the same memory bus
  that serves the core that a thread is running on when the memory is faulted
  (not when `malloc()` is called). If local memory is not available, it is found
  elsewhere, possibly leading to serious memory imbalances.

  The option `--localalloc` allocates memory on the local NUMA node, similar to the
  `numa_alloc_local()` function in the `libnuma` library. The option
  `--cpunodebind=nodes` binds the process to a given NUMA node (note that this can be
  larger or smaller than a CPU (socket); a NUMA node usually has multiple cores).

  The option `--physcpubind=cpus` binds the process to a given processor core (numbered
  according to `/proc/cpuinfo`, therefore including logical cores if Hyper-Threading is
  enabled).

  With Open MPI, you can use knowledge of the NUMA hierarchy and core numbering on your
  machine to calculate the correct NUMA node or processor number given the environment
  variable `OMPI_COMM_WORLD_LOCAL_RANK`. In most cases, however, it is easier to let
  mpiexec or a resource manager set the affinities.

The software [Open-MX](http://open-mx.gforge.inria.fr) provides faster communication for
Ethernet systems. We have not tried it, but it claims it can dramatically reduce latency
and increase bandwidth on Linux systems. You must first install this software and then
install MPICH or Open MPI to use it.

### What kind of license is PETSc released under?

See licensing {ref}`documentation <doc_license>`

### Why is PETSc written in C, instead of Fortran or C++?

When this decision was made, in the early 1990s, C enabled us to build data structures
for storing sparse matrices, solver information,
etc. in ways that Fortran simply did not allow. ANSI C was a complete standard that all
modern C compilers supported, so the language was identical on all machines. C++ was still
evolving, and compilers on different machines were not identical. Using C function pointers
to provide data encapsulation and polymorphism allowed us to get many of the advantages of
C++ without using such a large and more complicated language. It would have been natural and
reasonable to have coded PETSc in C++; we opted to use C instead.

### Does all the PETSc error checking and logging reduce PETSc's efficiency?

No.

(doc_faq_maintenance_strats)=

### How do such a small group of people manage to write and maintain such a large and marvelous package as PETSc?

1. **We work very efficiently**.

   - We use powerful editors and programming environments.
   - Our manual pages are generated automatically from formatted comments in the code,
     thus alleviating the need for creating and maintaining manual pages.
   - We employ continuous integration testing of the entire PETSc library on many different
     machine architectures. This process **significantly** protects (no bug-catching
     process is perfect) against inadvertently introducing bugs with new additions. Every
     new feature **must** pass our suite of thousands of tests as well as formal code
     review before it may be included.

2. **We are very careful in our design (and are constantly revising our design)**

   - PETSc as a package should be easy to use, write, and maintain. Our mantra is to write
     code like everyone is using it.

3. **We are willing to do the grunt work**

   - PETSc is regularly checked to make sure that all code conforms to our interface
     design. We will never keep in a bad design decision simply because changing it will
     require a lot of editing; we do a lot of editing.

4. **We constantly seek out and experiment with new design ideas**

   - We retain the useful ones and discard the rest. All of these decisions are based not
     just on performance, but also on **practicality**.

5. **Function and variable names must adhere to strict guidelines**

   - Even the rules about capitalization are designed to make it easy to figure out the
     name of a particular object or routine. Our memories are terrible, so careful,
     consistent naming puts less stress on our limited human RAM.

6. **The PETSc directory tree is designed to make it easy to move throughout the
   entire package**

7. **We have a rich, robust, and fast bug reporting system**

   - <mailto:petsc-maint@mcs.anl.gov> is always checked, and we pride ourselves on responding
     quickly and accurately. Email is very lightweight, and the bug reporting system retains
     an archive of all reported problems and fixes, so it is easy to re-find fixes to
     previously discovered problems.

8. **We contain the complexity of PETSc by using powerful object-oriented programming
   techniques**

   - Data encapsulation serves to abstract complex data formats or movement to
     human-readable format. This is why your program cannot, for example, look directly
     at what is inside the object `Mat`.
   - Polymorphism makes changing program behavior as easy as possible, and further
     abstracts the *intent* of your program from what is *written* in code. You call
     `MatMult()` regardless of whether your matrix is dense, sparse, parallel or
     sequential; you don't call a different routine for each format.

9. **We try to provide the functionality requested by our users**

### For complex numbers will I get better performance with C++?

To use PETSc with complex numbers you may use the following `configure` options:
`--with-scalar-type=complex` and either `--with-clanguage=c++` or (the default)
`--with-clanguage=c`. In our experience they deliver very similar performance
(speed), but if you are concerned you should just try both and see which is faster.

### How come when I run the same program on the same number of processes I get a "different" answer?

Inner products and norms in PETSc are computed using the `MPI_Allreduce()` command. For
different runs the values may arrive at a given process (via MPI) in a
different order, thus the order in which some floating point arithmetic operations are
performed will be different. Since floating point arithmetic is not
associative, the computed quantity may be slightly different.

Over a run the many slight differences in the inner products and norms will affect all the
computed results. It is important to realize that none of the computed answers are any
less right or wrong (in fact the sequential computation is no more right than the parallel
ones). All answers are equal, but some are more equal than others.

The discussion above assumes that the exact same algorithm is being used on the different
numbers of processes. When the algorithm is different for different numbers of processes
(almost all preconditioner algorithms except Jacobi are different for different numbers of
processes) then one expects to see (and does) a greater difference in results for
different numbers of processes. In some cases (for example the block Jacobi preconditioner) it
may be that the algorithm works for some numbers of processes and does not work for others.

### How come when I run the same linear solver on a different number of processes it takes a different number of iterations?

The convergence of many of the preconditioners in PETSc, including the default parallel
preconditioner block Jacobi, depends on the number of processes. The more processes, the
(slightly) slower the convergence. This is the nature of iterative solvers: more
parallelism means that more "older" information is used in the solution process, hence
slower convergence.

(doc_faq_gpuhowto)=

### Can PETSc use GPUs to speed up computations?

:::{seealso}
See the GPU development {ref}`roadmap <doc_gpu_roadmap>` for the latest information
regarding the state of PETSc GPU integration.

See the GPU install {ref}`documentation <doc_config_accel>` for up-to-date information on
installing PETSc to use GPUs.
:::

Quick summary of usage with CUDA:

- The `VecType` `VECSEQCUDA`, `VECMPICUDA`, or `VECCUDA` may be used with
  `VecSetType()` or `-vec_type seqcuda`, `mpicuda`, or `cuda` when
  `VecSetFromOptions()` is used.
- The `MatType` `MATSEQAIJCUSPARSE`, `MATMPIAIJCUSPARSE`, or `MATAIJCUSPARSE`
  may be used with `MatSetType()` or `-mat_type seqaijcusparse`, `mpiaijcusparse`, or
  `aijcusparse` when `MatSetFromOptions()` is used.
- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type
  cuda` and `-dm_mat_type aijcusparse`.

Quick summary of usage with OpenCL (provided by the ViennaCL library):

- The `VecType` `VECSEQVIENNACL`, `VECMPIVIENNACL`, or `VECVIENNACL` may be used
  with `VecSetType()` or `-vec_type seqviennacl`, `mpiviennacl`, or `viennacl`
  when `VecSetFromOptions()` is used.
- The `MatType` `MATSEQAIJVIENNACL`, `MATMPIAIJVIENNACL`, or `MATAIJVIENNACL`
  may be used with `MatSetType()` or `-mat_type seqaijviennacl`, `mpiaijviennacl`, or
  `aijviennacl` when `MatSetFromOptions()` is used.
- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type
  viennacl` and `-dm_mat_type aijviennacl`.
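
For example, SNES tutorial `ex19` (which creates its vectors and matrices through a
`DM`) can be moved to the GPU entirely from the command line on a CUDA-enabled build; a
sketch, assuming such a build and adjusting the number of ranks to your system:

```console
$ cd $PETSC_DIR/src/snes/tutorials
$ make ex19
$ mpiexec -n 1 ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -snes_monitor
```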

General hints:

- Since debugging on GPUs is difficult, it is useful to develop your code with the default
  (CPU) vector and matrix types and then use the command line options to switch to the GPU
  types for production runs.
- All of the Krylov methods except `KSPIBCGS` run on the GPU.
- Parts of most preconditioners run directly on the GPU. After setup, `PCGAMG` runs
  fully on GPUs, without any memory copies between the CPU and GPU.

Some GPU systems (for example many laptops) only run with single precision; thus, PETSc
must be built with the `configure` option `--with-precision=single`.

(doc_faq_extendedprecision)=

### Can I run PETSc with extended precision?

Yes, with gcc and gfortran. `configure` PETSc using the
options `--with-precision=__float128` and `--download-f2cblaslapack`.

:::{admonition} Warning
:class: yellow

External packages are not guaranteed to work in this mode!
:::

### Why doesn't PETSc use Qd to implement support for extended precision?

We tried really hard but could not. The problem is that the QD C++ classes, though they
try to mimic the built-in `double` type, are not native types and cannot
"just be used" in a general piece of numerical source code. Rather, the code has to be
rewritten to live within the limitations of the QD classes. However, PETSc can be built to
use quad precision, as detailed {ref}`here <doc_faq_extendedprecision>`.

### How do I cite PETSc?

Use {any}`these citations <doc_index_citing_petsc>`.

______________________________________________________________________

## Installation

### How do I begin using PETSc if the software has already been completely built and installed by someone else?

Assuming that the PETSc libraries have been successfully built for a particular
architecture and level of optimization, a new user must merely:

1. Set `PETSC_DIR` to the full path of the PETSc home
   directory. This will be the location of the `configure` script, and is usually called
   "petsc" or some variation of that (for example, `/home/username/petsc`).
2. Set `PETSC_ARCH`, which indicates the configuration on which PETSc will be
   used. Note that `$PETSC_ARCH` is simply a name the installer used when installing
   the libraries. There will exist a directory within `$PETSC_DIR` that is named after
   its corresponding `$PETSC_ARCH`. There may be several on a single system, for
   example "linux-c-debug" for the debug version compiled by a C compiler or
   "linux-c-opt" for the optimized version.
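
In a shell this amounts to something like the following (the paths here are hypothetical;
use whatever the installer chose):

```console
$ export PETSC_DIR=/home/username/petsc
$ export PETSC_ARCH=linux-c-debug
$ ls $PETSC_DIR/$PETSC_ARCH/lib    # the libraries for this configuration live here
```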

:::{admonition} Still Stuck?
See the {ref}`quick-start tutorial <tut_install>` for a step-by-step guide on
installing PETSc, in case you have missed a step.

See the users manual section on {ref}`getting started <sec_getting_started>`.
:::

### The PETSc distribution is SO large. How can I reduce my disk space usage?

1. The PETSc users manual is provided in PDF format at `$PETSC_DIR/manual.pdf`. You
   can delete this.
2. The PETSc test suite contains sample output for many of the examples. These are
   contained in the PETSc directories `$PETSC_DIR/src/*/tutorials/output` and
   `$PETSC_DIR/src/*/tests/output`. Once you have run the test examples, you may remove
   all of these directories to save some disk space. You can locate the largest with,
   e.g., `find . -name output -type d | xargs du -sh | sort -hr` on a Unix-based system.
3. The debugging versions of the libraries are larger than the optimized versions. In a
   pinch you can work with the optimized version, although we bid you good luck in
   finding bugs, as it is much easier with the debug version.

### I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI?

No, run `configure` with the option `--with-mpi=0`.

### Can I install PETSc to not use X windows (either under Unix or Microsoft Windows with GCC)?

Yes. Run `configure` with the additional flag `--with-x=0`.

### Why do you use MPI?

MPI is the message-passing standard. Because it is a standard, it will not frequently change over
time; thus, we do not have to change PETSc every time the provider of the message-passing
system decides to make an interface change. MPI was carefully designed by experts from
industry, academia, and government labs to provide the highest quality performance and
capability.

For example, the careful design of communicators in MPI allows the easy nesting of
different libraries; no other message-passing system provides this support. All of the
major parallel computer vendors were involved in the design of MPI and have committed to
providing quality implementations.

In addition, since MPI is a standard, several different groups have already provided
complete free implementations. Thus, one does not have to rely on the technical skills of
one particular group to provide the message-passing libraries. Today, MPI is the only
practical, portable approach to writing efficient parallel numerical software.

(invalid_mpi_compilers)=

### What do I do if my MPI compiler wrappers are invalid?

Most MPI implementations provide compiler wrappers (such as `mpicc`) which pass the
include and link options necessary to use that version of MPI on to the underlying
compilers. Configuration will fail if these wrappers are either absent or broken in the MPI pointed to by
`--with-mpi-dir`. You can rerun `configure` with the additional option
`--with-mpi-compilers=0`, which will try to auto-detect working compilers; however,
these compilers may be incompatible with the particular MPI build. If this fix does not
work, run with `--with-cc=[your_c_compiler]` where you know `your_c_compiler` works
with this particular MPI, and likewise for C++ (`--with-cxx=[your_cxx_compiler]`) and Fortran
(`--with-fc=[your_fortran_compiler]`).

(bit_indices)=

### When should/can I use the `configure` option `--with-64-bit-indices`?

By default the type that PETSc uses to index into arrays and keep sizes of arrays is
`PetscInt`, defined to be a 32-bit `int`. If your problem:

- involves more than 2^31 - 1 unknowns (around 2 billion), or
- has a matrix that might contain more than 2^31 - 1 nonzeros on a single process,

then you need to use this option. Otherwise you will get strange crashes.

This option can be used when you are using either 32-bit or 64-bit pointers. You do not
need to use this option if you are using 64-bit pointers unless one of the two conditions
above holds.

### What if I get an internal compiler error?

You can rebuild the offending file individually with a lower optimization level. **Then
make sure to complain to the compiler vendor and file a bug report**. For example, if the
compiler chokes on `src/mat/impls/baij/seq/baijsolvtrannat.c` you can run the following
from `$PETSC_DIR`:

```console
$ make -f gmakefile PCC_FLAGS="-O1" $PETSC_ARCH/obj/src/mat/impls/baij/seq/baijsolvtrannat.o
$ make all
```

### How do I enable Python bindings (petsc4py) with PETSc?

1. Install [Cython](https://cython.org/).
2. `configure` PETSc with the `--with-petsc4py=1` option.
3. Set `PYTHONPATH=$PETSC_DIR/$PETSC_ARCH/lib`.

(macos_gfortran)=

### What Fortran compiler do you recommend on macOS?

We recommend using [homebrew](https://brew.sh/) to install [gfortran](https://gcc.gnu.org/wiki/GFortran); see {any}`doc_macos_install`

### How can I find the URL locations of the packages you install using `--download-PACKAGE`?

```console
$ grep "self.download " $PETSC_DIR/config/BuildSystem/config/packages/*.py
```

### How to fix the problem: PETSc was configured with one MPICH (or Open MPI) `mpi.h` version but now appears to be compiling using a different MPICH (or Open MPI) `mpi.h` version

This generally happens for one of two reasons:

- You previously ran `configure` with the option `--download-mpich` (or `--download-openmpi`)
  but later ran `configure` to use a version of MPI already installed on the
  machine. Solution:

  ```console
  $ rm -rf $PETSC_DIR/$PETSC_ARCH
  $ ./configure --your-args
  ```

(mpi_network_misconfigure)=

### What does it mean when `make check` hangs or errors on `PetscOptionsInsertFile()`?

For example:

```none
Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
See https://petsc.org/release/faq/
[0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #3 PetscInitialize() line 828 in /Users/barrysmith/Src/PETSc/src/sys/objects/pinit.c
```

or

```none
$ make check
Running check examples to verify correct installation
Using PETSC_DIR=/Users/barrysmith/Src/petsc and PETSC_ARCH=arch-fix-mpiexec-hang-2-ranks
C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
PROGRAM SEEMS TO BE HANGING HERE
```

This usually occurs when network settings are misconfigured (perhaps due to a VPN), resulting in a failure or hang in the system call `gethostbyname()`.

- Verify you are using the correct `mpiexec` for the MPI you have linked PETSc with.

- If you have a VPN enabled on your machine, try turning it off and then running `make check` to
  verify that it is not the VPN playing poorly with MPI.

- If ``ping `hostname` `` (`/sbin/ping` on macOS) fails or hangs, do:

  ```none
  echo 127.0.0.1 `hostname` | sudo tee -a /etc/hosts
  ```

  and try `make check` again.

- Try completely disconnecting your machine from the network and see if `make check` then works.

- Try the PETSc `configure` option `--download-mpich-device=ch3:nemesis` with `--download-mpich`.

______________________________________________________________________

## Usage

### How can I redirect PETSc's `stdout` and `stderr` when programming with a GUI interface in Windows Developer Studio or to C++ streams?

To overload just the error messages, write your own `MyPrintError()` function that does
whatever you want (including popping up windows etc.) and use it as shown below.

```c
extern "C" {
  int PASCAL WinMain(HINSTANCE,HINSTANCE,LPSTR,int);
};

#include <petscsys.h>
#include <mpi.h>

const char help[] = "Set up from main";

int MyPrintError(const char error[], ...)
{
  printf("%s", error);
  return 0;
}

int main(int ac, char *av[])
{
  char      buf[256];
  HINSTANCE inst;

  inst = (HINSTANCE)GetModuleHandle(NULL);
  PetscErrorPrintf = MyPrintError;

  /* concatenate the command line arguments into a single string for WinMain() */
  buf[0] = 0;
  for (int i = 1; i < ac; ++i) {
    strcat(buf, av[i]);
    strcat(buf, " ");
  }

  PetscCall(PetscInitialize(&ac, &av, NULL, help));

  return WinMain(inst, NULL, buf, SW_SHOWNORMAL);
}
```

Place this file in the project and compile with these preprocessor definitions:

```
WIN32
_DEBUG
_CONSOLE
_MBCS
USE_PETSC_LOG
USE_PETSC_BOPT_g
USE_PETSC_STACK
_AFXDLL
```

And these link options:

```
/nologo
/subsystem:console
/incremental:yes
/debug
/machine:I386
/nodefaultlib:"libcmtd.lib"
/nodefaultlib:"libcd.lib"
/nodefaultlib:"mvcrt.lib"
/pdbtype:sept
```

:::{note}
The above is compiled and linked as if it were a console program. The linker will search
for a main, and from it the `WinMain` will start. This works with MFC templates and
derived classes too.

When writing a Windows console application you do not need to do anything; the `stdout`
and `stderr` are automatically output to the console window.
:::

To change where all PETSc `stdout` and `stderr` go (you can also reassign
`PetscVFPrintf()` to handle `stdout` and `stderr` any way you like), write the
following function:

```
PetscErrorCode mypetscvfprintf(FILE *fd, const char format[], va_list Argp)
{
  PetscFunctionBegin;
  if (fd != stdout && fd != stderr) { /* handle regular files */
    PetscCall(PetscVFPrintfDefault(fd, format, Argp));
  } else {
    char buff[1024]; /* Make sure to assign a large enough buffer */
    int  length;

    PetscCall(PetscVSNPrintf(buff, 1024, format, &length, Argp));

    /* now send buff to whatever stream or whatever you want */
  }
  PetscFunctionReturn(PETSC_SUCCESS);
}
```

Then assign `PetscVFPrintf = mypetscvfprintf;` before `PetscInitialize()` in your main
program.

### I want to use Hypre boomerAMG without GMRES but when I run `-pc_type hypre -pc_hypre_type boomeramg -ksp_type preonly` I don't get a very accurate answer!

You should run with `-ksp_type richardson` to have PETSc run several V or W
cycles. `-ksp_type preonly` causes boomerAMG to use only one V/W cycle. You can control
how many cycles are used in a single application of the boomerAMG preconditioner with
`-pc_hypre_boomeramg_max_iter <it>` (the default is 1). You can also control the
tolerance boomerAMG uses to decide whether to stop before `max_iter` with
`-pc_hypre_boomeramg_tol <tol>` (the default is 1.e-7). Run with `-ksp_view` to see
all the hypre options used and `-help | grep boomeramg` to see all the command line
options.
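
Putting the options above together (here `./your_app` stands in for your own PETSc
program, and the tolerance value is only illustrative):

```console
$ ./your_app -ksp_type richardson -pc_type hypre -pc_hypre_type boomeramg \
    -pc_hypre_boomeramg_max_iter 2 -pc_hypre_boomeramg_tol 1e-9 -ksp_view
```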

### How do I use PETSc for Domain Decomposition?

PETSc includes Additive Schwarz methods in the suite of preconditioners under the umbrella
of `PCASM`. These may be activated with the runtime option `-pc_type asm`. Various
other options may be set, including the degree of overlap with `-pc_asm_overlap <number>`
and the type of restriction/extension with
`-pc_asm_type [basic,restrict,interpolate,none]`, among several others. You may see the
available ASM options by using `-pc_type asm
-help`. See the procedural interfaces in the manual pages, for example `PCASMSetType()`,
and check the index of the users manual for `PCASMCreateSubdomains()`.

PETSc also contains a domain decomposition inspired wirebasket or face based two level
method where the coarse mesh to fine mesh interpolation is defined by solving specific
local subdomain problems. It currently only works for 3D scalar problems on structured
grids created with PETSc `DMDA`. See the manual page for `PCEXOTIC` and
`src/ksp/ksp/tutorials/ex45.c` for an example.

PETSc also contains a balancing Neumann-Neumann type preconditioner, see the manual page
for `PCBDDC`. This requires matrices be constructed with `MatCreateIS()` via the finite
element method. See `src/ksp/ksp/tests/ex59.c` for an example.

### You have `MATAIJ` and `MATBAIJ` matrix formats, and `MATSBAIJ` for symmetric storage, how come no `MATSAIJ`?

Just for historical reasons; the `MATSBAIJ` format with block size one is just as efficient as
a `MATSAIJ` would be.

### Can I create BAIJ matrices with different size blocks for different block rows?

No. The `MATBAIJ` format only supports a single fixed block size on the entire
matrix. But the `MATAIJ` format automatically searches for matching rows and thus still
takes advantage of the natural blocks in your matrix to obtain good performance.

:::{note}
If you use `MATAIJ` you cannot use `MatSetValuesBlocked()`.
:::

### How do I access the values of a remote parallel PETSc Vec?

1. On each process create a local `Vec` large enough to hold all the values it wishes to
   access.
2. Create a `VecScatter` that scatters from the parallel `Vec` into the local `Vec`.
3. Use `VecGetArray()` to access the values in the local `Vec`.

For example, assume we have distributed a vector `vecGlobal` of size $N$ over
$R$ ranks, and each rank holds $m = N/R$ values (assume for simplicity that
$N$ is cleanly divisible by $R$). We want each rank $r$ to gather the
first $n$ (with $n \leq m$) values from its immediately superior neighbor
$r+1$ (the final rank will retrieve from rank 0).

```
Vec            vecLocal;
IS             isLocal, isGlobal;
VecScatter     ctx;
PetscScalar    *arr;
PetscInt       n, N, firstGlobalIndex; /* n is the number of values to gather, as defined above */
MPI_Comm       comm;
PetscMPIInt    r, R;

/* Create sequential local vector, big enough to hold local portion */
PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, &vecLocal));

/* Create IS to describe where we want to scatter to */
PetscCall(ISCreateStride(PETSC_COMM_SELF, n, 0, 1, &isLocal));

/* Compute the global indices */
PetscCall(VecGetSize(vecGlobal, &N));
PetscCall(PetscObjectGetComm((PetscObject) vecGlobal, &comm));
PetscCallMPI(MPI_Comm_rank(comm, &r));
PetscCallMPI(MPI_Comm_size(comm, &R));
firstGlobalIndex = r == R-1 ? 0 : (N/R)*(r+1);

/* Create IS that describes where we want to scatter from */
PetscCall(ISCreateStride(comm, n, firstGlobalIndex, 1, &isGlobal));

/* Create the VecScatter context */
PetscCall(VecScatterCreate(vecGlobal, isGlobal, vecLocal, isLocal, &ctx));

/* Gather the values */
PetscCall(VecScatterBegin(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(VecScatterEnd(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));

/* Retrieve and do work */
PetscCall(VecGetArray(vecLocal, &arr));
/* Work */
PetscCall(VecRestoreArray(vecLocal, &arr));

/* Don't forget to clean up */
PetscCall(ISDestroy(&isLocal));
PetscCall(ISDestroy(&isGlobal));
PetscCall(VecScatterDestroy(&ctx));
PetscCall(VecDestroy(&vecLocal));
```

(doc_faq_usage_alltoone)=

### How do I collect to a single processor all the values from a parallel PETSc Vec?

1. Create the `VecScatter` context that will do the communication:

   ```
   Vec        in_par,out_seq;
   VecScatter ctx;

   PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
   ```

2. Initiate the communication (this may be repeated if you wish):

   ```
   PetscCall(VecScatterBegin(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
   PetscCall(VecScatterEnd(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
   /* May destroy context now if no additional scatters are needed, otherwise reuse it */
   PetscCall(VecScatterDestroy(&ctx));
   ```

Note that this simply concatenates in the parallel ordering of the vector (determined by the
`MPI_Comm_rank` of the owning process). If you are using a `Vec` from
`DMCreateGlobalVector()` you likely want to first call `DMDAGlobalToNaturalBegin()`
followed by `DMDAGlobalToNaturalEnd()` to scatter the original `Vec` into the natural
ordering in a new global `Vec` before calling `VecScatterBegin()`/`VecScatterEnd()`
to scatter the natural `Vec` onto all processes.

### How do I collect all the values from a parallel PETSc Vec on the 0th rank?

See the FAQ entry on collecting to {ref}`an arbitrary processor <doc_faq_usage_alltoone>`, but
replace

```
PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
```

with

```
PetscCall(VecScatterCreateToZero(in_par, &ctx, &out_seq));
```

:::{note}
The same ordering considerations as discussed in the aforementioned entry also apply
here.
:::

### How can I read in or write out a sparse matrix in Matrix Market, Harwell-Boeing, Slapc or other ASCII format?

If you can read or write your matrix using Python or MATLAB/Octave, `PetscBinaryIO`
modules are provided at `$PETSC_DIR/lib/petsc/bin` for each language that can assist
with reading and writing. If you just want to convert a Matrix Market file, you can use:

```console
$ python -m $PETSC_DIR/lib/petsc/bin/PetscBinaryIO convert matrix.mtx
```

to produce `matrix.petsc`.

You can also call the script directly or import it from your Python code. There is also a
`PETScBinaryIO.jl` Julia package.

For other formats, either adapt one of the above libraries or see the examples in
`$PETSC_DIR/src/mat/tests`, specifically `ex72.c` or `ex78.c`. You will likely need
to modify the code slightly to match your required ASCII format.

:::{note}
Never read or write an ASCII matrix file in parallel.

Instead, read it in sequentially with a standalone code based on `ex72.c` or `ex78.c`,
then save the matrix with the binary viewer `PetscViewerBinaryOpen()` and load the
matrix in parallel in your "real" PETSc program with `MatLoad()`.

For writing, save with the binary viewer and then load with the sequential code to store
it as ASCII.
:::

### Does TSSetFromOptions(), SNESSetFromOptions(), or KSPSetFromOptions() reset all the parameters I previously set, or why do they not seem to work?

If `XXSetFromOptions()` is used (with `-xxx_type aaaa`) to change the type of the
object then all parameters associated with the previous type are removed. Otherwise it
does not reset parameters.

`TS/SNES/KSPSetXXX()` commands that set properties for a particular type of object (such
as `KSPGMRESSetRestart()`) ONLY work if the object is ALREADY of that type. For example,
with

```
KSP ksp;

PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
PetscCall(KSPGMRESSetRestart(ksp, 10));
```

the restart will be ignored since the type has not yet been set to `KSPGMRES`. To have
those values take effect you should do one of the following:

- Allow setting the type from the command line; if it is not on the command line then the
  default type is automatically set.

```
/* Create generic object */
XXXCreate(..,&obj);
/* Must set all settings here, or use the defaults */
XXXSetFromOptions(obj);
```

- Hardwire the type in the code, but allow the user to override it via a subsequent
  `XXXSetFromOptions()` call. This essentially allows the user to customize what the
  "default" type of the object is.

```
/* Create generic object */
XXXCreate(..,&obj);
/* Set type directly */
XXXSetYYYYY(obj,...);
/* Can always change to different type */
XXXSetFromOptions(obj);
```

### How do I compile and link my own PETSc application codes and can I use my own `makefile` or rules for compiling code, rather than PETSc's?

See the {ref}`section <sec_writing_application_codes>` of the users manual on writing
application codes with PETSc.

### Can I use CMake to build my own project that depends on PETSc?

See the {ref}`section <sec_writing_application_codes>` of the users manual on writing
application codes with PETSc.

### How can I put carriage returns in `PetscPrintf()` statements from Fortran?

You can use the same notation as in C, just put a `\n` in the string. Note that no other C
format instruction is supported.

Or you can use the Fortran concatenation `//` and `char(10)`; for example `'some
string'//char(10)//'another string'` on the next line.

### How can I implement callbacks using C++ class methods?

Declare the class method static. Static methods do not have a `this` pointer, but the
`void*` context parameter will usually be cast to a pointer to the class where it can
serve the same function.

:::{admonition} Remember
All PETSc callbacks return `PetscErrorCode`.
:::

### Everyone knows that when you code Newton's Method you should compute the function and its Jacobian at the same time. How can one do this in PETSc?

The update in Newton's method is computed as

$$
u^{n+1} = u^n - \lambda \left[ J(u^n) \right]^{-1} F(u^n)
$$

The reason PETSc doesn't default to computing both the function and Jacobian at the same
time is:

- In order to do the line search, $F \left(u^n - \lambda \, \text{step} \right)$ may
  need to be computed for several $\lambda$. The Jacobian is not needed for each of
  those, and one does not know in advance which will be the final $\lambda$ until
  after the function value is computed, so many extra Jacobians may be computed.
- In the final step, if $|| F(u^p)||$ satisfies the convergence criteria then a
  Jacobian need not be computed.

You are free to have your `FormFunction()` compute as much of the Jacobian at that point
as you like, keep the information in the user context (the final argument to
`FormFunction()` and `FormJacobian()`), and then retrieve the information in your
`FormJacobian()` function.
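
The line-search argument can be made concrete outside PETSc. Below is a minimal plain-Python/NumPy sketch (not the PETSc API; the residual and backtracking parameters are made up for illustration) that counts evaluations: the function is evaluated several times per Newton step during backtracking, while the Jacobian is evaluated only once per step.

```python
import numpy as np

def F(u):
    # A made-up nonlinear residual with solution u = (1, 1)
    return np.array([u[0]**2 - 1.0, u[0]*u[1] - 1.0])

def J(u):
    # Analytic Jacobian of F
    return np.array([[2.0*u[0], 0.0], [u[1], u[0]]])

def newton(u, tol=1e-10, max_it=50):
    f_evals = jac_evals = 0
    for _ in range(max_it):
        r = F(u); f_evals += 1
        if np.linalg.norm(r) < tol:
            break
        step = np.linalg.solve(J(u), r); jac_evals += 1
        lam = 1.0
        # Backtracking line search: F may be evaluated for several lambda,
        # but the Jacobian is never needed here
        while lam > 1e-8:
            trial = u - lam*step
            f_evals += 1
            if np.linalg.norm(F(trial)) < np.linalg.norm(r):
                u = trial
                break
            lam *= 0.5
    return u, f_evals, jac_evals

u, f_evals, jac_evals = newton(np.array([4.0, -2.0]))
print(u, f_evals, jac_evals)
```

Running this shows strictly more function than Jacobian evaluations, which is exactly why fusing the two would waste work.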

### Computing the Jacobian or preconditioner is time consuming. Is there any way to compute it less often?

PETSc has a variety of ways of lagging the computation of the Jacobian or the
preconditioner. They are documented in the manual page for `SNESComputeJacobian()`
and in the {ref}`users manual <ch_snes>`:

- `-snes_lag_jacobian` (`SNESSetLagJacobian()`) How often the Jacobian is rebuilt (use -1 to
  never rebuild, use -2 to rebuild the next time requested and then never again).
- `-snes_lag_jacobian_persists` (`SNESSetLagJacobianPersists()`) Forces lagging of the Jacobian
  through multiple `SNES` solves, the same as passing -2 to `-snes_lag_jacobian`. By default,
  each new `SNES` solve triggers a recomputation of the Jacobian.
- `-snes_lag_preconditioner` (`SNESSetLagPreconditioner()`) How often the preconditioner is
  rebuilt. Note: if you are lagging the Jacobian the system will know that the matrix has
  not changed and will not recompute the (same) preconditioner.
- `-snes_lag_preconditioner_persists` (`SNESSetLagPreconditionerPersists()`) The preconditioner
  lags through multiple `SNES` solves.

:::{note}
These are often (but need not be) used in combination with
`-snes_mf_operator`, which applies the fresh Jacobian matrix-free for every
matrix-vector product. Otherwise the out-of-date matrix-vector product, computed with
the lagged Jacobian, will be used.
:::

By using `KSPMonitorSet()` and/or `SNESMonitorSet()` one can provide code that monitors the
convergence rate and automatically triggers an update of the Jacobian or preconditioner
based on decreasing convergence of the iterative method. For example, if the number of `SNES`
iterations doubles one might trigger a new computation of the Jacobian. Experimentation is
the only general purpose way to determine which approach is best for your problem.

:::{important}
It is also vital to experiment on your true problem at the scale you will be solving
the problem since the performance benefits depend on the exact problem and the problem
size!
:::

### How can I use Newton's Method Jacobian free? Can I difference a different function than provided with `SNESSetFunction()`?

The simplest way is with the option `-snes_mf`, which will use finite differencing of the
function provided to `SNESComputeFunction()` to approximate the action of the Jacobian.

:::{important}
Since no matrix representation of the Jacobian is provided, the `-pc_type` used with
this option must be `-pc_type none`. You may provide a custom preconditioner with
`SNESGetKSP()`, `KSPGetPC()`, and `PCSetType()` and use `PCSHELL`.
:::

The option `-snes_mf_operator` will apply the Jacobian matrix-free (in the
Krylov solvers) but will use whatever matrix you provided with `SNESSetJacobian()`
(assuming you set one) to compute the preconditioner.

To write the code (rather than use the options above) use `MatCreateSNESMF()` and pass
the resulting matrix object to `SNESSetJacobian()`.

For purely matrix-free (like `-snes_mf`) pass the matrix object for both matrix
arguments and pass the function `MatMFFDComputeJacobian()`.

To provide your own approximate Jacobian matrix to compute the preconditioner (like
`-snes_mf_operator`), pass this other matrix as the second matrix argument to
`SNESSetJacobian()`. Make sure your provided `computejacobian()` function calls
`MatAssemblyBegin()` and `MatAssemblyEnd()` separately on **BOTH** matrix arguments
to this function. See `src/snes/tests/ex7.c`.

To difference a different function than that passed to `SNESSetFunction()` to compute the
matrix-free Jacobian multiply, call `MatMFFDSetFunction()` to set that other function. See
`src/snes/tests/ex7.c`.

(doc_faq_usage_condnum)=

### How can I determine the condition number of a matrix?

For small matrices, the condition number can be reliably computed using

```text
-pc_type svd -pc_svd_monitor
```

For larger matrices, you can run with

```text
-pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000
```

to get approximations to the condition number of the operator. This will generally be
accurate for the largest singular values, but may overestimate the smallest singular value
unless the method has converged. Make sure to avoid restarts. To estimate the condition
number of the preconditioned operator, use `-pc_type somepc` in the last command.

You can use [SLEPc](https://slepc.upv.es) for highly scalable, efficient, and quality eigenvalue computations.
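
What the SVD-based approach reports can be sketched in a few lines of NumPy (illustration only, not PETSc; the matrix is made up): the condition number is the ratio of the largest to the smallest singular value.

```python
import numpy as np

# A small, deliberately ill-conditioned matrix
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-6]])

# Condition number = largest singular value / smallest singular value
sigma = np.linalg.svd(A, compute_uv=False)
cond = sigma[0] / sigma[-1]
print(f"sigma_max = {sigma[0]:.3e}, sigma_min = {sigma[-1]:.3e}, cond = {cond:.3e}")

# Cross-check against NumPy's built-in 2-norm condition number
assert np.isclose(cond, np.linalg.cond(A, 2))
```

An iterative method such as GMRES only approximates these extreme singular values, which is why the restart-free run above is needed for a trustworthy estimate.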

### How can I compute the inverse of a matrix in PETSc?

:::{admonition} Are you sure?
:class: yellow

It is very expensive to compute the inverse of a matrix and very rarely needed in
practice. We highly recommend avoiding algorithms that need it.
:::

The inverse of a matrix (dense or sparse) is essentially always dense, so begin by
creating a dense matrix B and filling it with the identity matrix (ones along the diagonal);
also create a dense matrix X of the same size that will hold the solution. Then factor the
matrix you wish to invert with `MatLUFactor()` or `MatCholeskyFactor()`, calling the
result A. Then call `MatMatSolve(A,B,X)` to compute the inverse into X. See also the section
on {any}`the Schur complement <how_can_i_compute_the_schur_complement>`.
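
The factor-then-multi-solve recipe can be sketched with NumPy standing in for `MatLUFactor()`/`MatMatSolve()` (illustration only; a small made-up dense matrix):

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# Fill B with the identity: one right-hand side per column of the inverse
B = np.eye(3)

# np.linalg.solve plays the role of the factor-and-solve pair
# MatLUFactor() + MatMatSolve(A, B, X); X then holds A^{-1}
X = np.linalg.solve(A, B)

assert np.allclose(A @ X, np.eye(3))
```

Even for this tiny example, every column of the identity costs one triangular solve; for a large sparse matrix the result is a dense n-by-n object, which is why inverses should be avoided.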

(how_can_i_compute_the_schur_complement)=

### How can I compute the Schur complement in PETSc?

:::{admonition} Are you sure?
:class: yellow

It is very expensive to compute the Schur complement of a matrix and very rarely needed
in practice. We highly recommend avoiding algorithms that need it.
:::

The Schur complement of the matrix $M \in \mathbb{R}^{\left(p+q \right) \times
\left(p + q \right)}$

$$
M = \begin{bmatrix}
A & B \\
C & D
\end{bmatrix}
$$

where

$$
A \in \mathbb{R}^{p \times p}, \quad B \in \mathbb{R}^{p \times q}, \quad C \in \mathbb{R}^{q \times p}, \quad D \in \mathbb{R}^{q \times q}
$$

is given by

$$
S_D := A - BD^{-1}C \\
S_A := D - CA^{-1}B
$$

Like the inverse, the Schur complement of a matrix (dense or sparse) is essentially always
dense, so assuming you wish to calculate $S_A = D - C \underbrace{
\overbrace{(A^{-1})}^{U} B}_{V}$ begin by:

1. Forming a dense matrix $B$.
2. Creating another dense matrix $V$ of the same size.
3. Factoring the matrix $A$ directly with `MatLUFactor()` or
   `MatCholeskyFactor()`, or using `MatGetFactor()` followed by
   `MatLUFactorSymbolic()` and `MatLUFactorNumeric()` if you wish to use an
   external solver package like SuperLU_Dist. Call the result $U$.
4. Then call `MatMatSolve(U,B,V)`.
5. Then call `MatMatMult(C,V,MAT_INITIAL_MATRIX,1.0,&S)`.
6. Now call `MatAXPY(S,-1.0,D,SUBSET_NONZERO_PATTERN)`.
7. Followed by `MatScale(S,-1.0)`.

For computing Schur complements like this it does not make sense to use the `KSP`
iterative solvers, since for solving many moderate size problems a direct
factorization is much faster than iterative solvers. As you can see, this requires a great
deal of work space and computation, so it is best avoided.
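
Steps 1-7 above can be checked with a small NumPy sketch (illustration only; dense made-up blocks stand in for the PETSc objects, and `np.linalg.solve` for the factor-and-solve pair):

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 4, 3
A = rng.standard_normal((p, p)) + 4.0*np.eye(p)  # keep A comfortably invertible
B = rng.standard_normal((p, q))
C = rng.standard_normal((q, p))
D = rng.standard_normal((q, q))

# Steps 3-4: "factor" A and solve A V = B  (stand-in for MatMatSolve(U, B, V))
V = np.linalg.solve(A, B)
# Step 5: S = C V            (MatMatMult)
S = C @ V
# Step 6: S = S - D          (MatAXPY with alpha = -1)
S = S - D
# Step 7: S = -S             (MatScale)
S = -S

# S now equals the Schur complement S_A = D - C A^{-1} B
assert np.allclose(S, D - C @ np.linalg.inv(A) @ B)
```

Note the sign gymnastics of steps 6-7: subtracting $D$ and negating is equivalent to $D - CA^{-1}B$ while only ever updating $S$ in place.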

However, it is not necessary to assemble the Schur complement $S$ in order to solve
systems with it. Use `MatCreateSchurComplement(A,A_pre,B,C,D,&S)` to create a
matrix that applies the action of $S$ (using `A_pre` to solve with `A`), but
does not assemble.

Alternatively, if you already have a block matrix `M = [A, B; C, D]` (in some
ordering), then you can create index sets (`IS`) `isa` and `isb` to address each
block, then use `MatGetSchurComplement()` to create the Schur complement and/or an
approximation suitable for preconditioning.

Since $S$ is generally dense, standard preconditioning methods cannot typically be
applied directly to Schur complements. There are many approaches to preconditioning Schur
complements, including using the `SIMPLE` approximation

$$
D - C \text{diag}(A)^{-1} B
$$

to create a sparse matrix that approximates the Schur complement (this is returned by
default for the optional "preconditioning" matrix in `MatGetSchurComplement()`).
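
A small NumPy sketch (illustration only; the blocks are made up, with a nearly diagonal $A$ where the approximation is reasonable) shows how the SIMPLE approximation relates to the exact Schur complement:

```python
import numpy as np

rng = np.random.default_rng(1)
p, q = 5, 2
# A nearly diagonal block: diag(A)^{-1} is then a decent stand-in for A^{-1}
A = np.diag(rng.uniform(1.0, 2.0, p)) + 0.01*rng.standard_normal((p, p))
B = rng.standard_normal((p, q))
C = rng.standard_normal((q, p))
D = rng.standard_normal((q, q))

S_exact  = D - C @ np.linalg.solve(A, B)         # dense, expensive
S_simple = D - C @ np.diag(1.0/np.diag(A)) @ B   # sparse-friendly SIMPLE

# Relative discrepancy between the two
print(np.linalg.norm(S_exact - S_simple) / np.linalg.norm(S_exact))
```

The key point is structural: $\text{diag}(A)^{-1}$ keeps the sparsity of $B$ and $C$, so `S_simple` can be assembled and preconditioned cheaply, while `S_exact` is dense.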

An alternative is to interpret the matrices as differential operators and apply
approximate commutator arguments to find a spectrally equivalent operation that can be
applied efficiently (see the "PCD" preconditioners {cite}`elman_silvester_wathen_2014`). A
variant of this is the least squares commutator, which is closely related to the
Moore-Penrose pseudoinverse, and is available in `PCLSC` which operates on matrices of
type `MATSCHURCOMPLEMENT`.

### Do you have examples of doing unstructured grid Finite Element Method (FEM) with PETSc?

There are at least three ways to write finite element codes using PETSc:

1. Use `DMPLEX`, which is a high level approach to manage your mesh and
   discretization. See the {ref}`tutorials sections <tut_stokes>` for further information,
   or see `src/snes/tutorials/ex62.c`.
2. Use packages such as [deal.ii](https://www.dealii.org), [libMesh](https://libmesh.github.io), or
   [Firedrake](https://www.firedrakeproject.org), which use PETSc for the solvers.
3. Manage the grid data structure yourself and use PETSc `PetscSF`, `IS`, and `VecScatter` to
   perform the required ghost point communication. See
   `src/snes/tutorials/ex10d/ex10.c`.

### DMDA decomposes the domain differently than the MPI_Cart_create() command. How can one use them together?

`MPI_Cart_create()` first divides the mesh along the z direction, then the y, then
the x. `DMDA` divides along the x, then y, then z. Thus, for example, rank 1 of the
processes will be in a different part of the mesh for the two schemes. To resolve this you
can create a new MPI communicator that you pass to `DMDACreate()` that renumbers the
process ranks so that each physical process shares the same part of the mesh with both the
`DMDA` and the `MPI_Cart_create()`. The code to determine the new numbering was
provided by Rolf Kuiper:

```
// the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively
// (no parallelization in direction 'dir' means dir_procs = 1)

MPI_Comm    NewComm;
int         x, y, z;
PetscMPIInt MPI_Rank, NewRank;

// get rank from MPI ordering:
PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &MPI_Rank));

// calculate coordinates of cpus in MPI ordering:
x = MPI_Rank / (z_procs*y_procs);
y = (MPI_Rank % (z_procs*y_procs)) / z_procs;
z = (MPI_Rank % (z_procs*y_procs)) % z_procs;

// set new rank according to PETSc ordering:
NewRank = z*y_procs*x_procs + y*x_procs + x;

// create communicator with new ranks according to PETSc ordering
PetscCallMPI(MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm));

// override the default communicator (was MPI_COMM_WORLD as default)
PETSC_COMM_WORLD = NewComm;
```
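
The renumbering above is a pure index computation, so it can be checked outside MPI. A short Python sketch mirroring the C code (with a made-up processor grid) confirms the mapping is a permutation of the ranks:

```python
x_procs, y_procs, z_procs = 4, 3, 2  # made-up processor grid

def petsc_rank(mpi_rank):
    # coordinates of the process in MPI_Cart_create() ordering (z varies fastest)
    x = mpi_rank // (z_procs*y_procs)
    y = (mpi_rank % (z_procs*y_procs)) // z_procs
    z = (mpi_rank % (z_procs*y_procs)) % z_procs
    # rank in DMDA (PETSc) ordering: x varies fastest, then y, then z
    return z*y_procs*x_procs + y*x_procs + x

size = x_procs*y_procs*z_procs
new_ranks = [petsc_rank(r) for r in range(size)]

# every process gets exactly one new rank: the mapping is a bijection,
# so MPI_Comm_split() with these keys produces a valid reordered communicator
assert sorted(new_ranks) == list(range(size))
print(new_ranks[:6])  # → [0, 12, 4, 16, 8, 20]
```
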

### When solving a system with Dirichlet boundary conditions I can use MatZeroRows() to eliminate the Dirichlet rows but this results in a non-symmetric system. How can I apply Dirichlet boundary conditions but keep the matrix symmetric?

- For nonsymmetric systems put the appropriate boundary solutions in the x vector and use
  `MatZeroRows()` followed by `KSPSetOperators()`.
- For symmetric problems use `MatZeroRowsColumns()`.
- If you have many Dirichlet locations you can use `MatZeroRows()` (**not**
  `MatZeroRowsColumns()`) and `-ksp_type preonly -pc_type redistribute` (see
  `PCREDISTRIBUTE`) and PETSc will repartition the parallel matrix for load
  balancing. In this case the new matrix solved remains symmetric even though
  `MatZeroRows()` is used.

An alternative approach is, when assembling the matrix (generating values and passing
them to the matrix), to never include locations for the Dirichlet grid points in the
vector and matrix, instead taking them into account as you put the other values into the
load.
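
What `MatZeroRowsColumns()` accomplishes can be sketched with NumPy (illustration only; a 1D Laplacian with one made-up Dirichlet value): zeroing the row alone breaks symmetry, while also zeroing the column and moving the known value into the right-hand side keeps the system symmetric with the same solution.

```python
import numpy as np

n = 5
A = 2.0*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # symmetric 1D Laplacian
b = np.ones(n)
d, g = 0, 3.0   # Dirichlet row index and boundary value

# Row-only elimination (MatZeroRows): solution is right, symmetry is lost
Ar, br = A.copy(), b.copy()
Ar[d, :] = 0.0; Ar[d, d] = 1.0; br[d] = g
assert not np.allclose(Ar, Ar.T)

# Row-and-column elimination (MatZeroRowsColumns): first move the known
# value into the right-hand side, then zero both the row and the column
Ac, bc = A.copy(), b.copy()
bc -= Ac[:, d]*g
Ac[d, :] = 0.0; Ac[:, d] = 0.0; Ac[d, d] = 1.0; bc[d] = g
assert np.allclose(Ac, Ac.T)

# Both systems have the same solution
assert np.allclose(np.linalg.solve(Ar, br), np.linalg.solve(Ac, bc))
```
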

### How can I get PETSc vectors and matrices to MATLAB or vice versa?

There are numerous ways to work with PETSc and MATLAB. All but the first approach
require PETSc to be configured with `--with-matlab`.

1. To save PETSc `Mat` and `Vec` to files that can be read from MATLAB use the
   `PetscViewerBinaryOpen()` viewer and `VecView()` or `MatView()` to save objects
   for MATLAB and `VecLoad()` and `MatLoad()` to get the objects that MATLAB has
   saved. See `share/petsc/matlab/PetscBinaryRead.m` and
   `share/petsc/matlab/PetscBinaryWrite.m` for loading and saving the objects in MATLAB.
2. Using the [MATLAB Engine](https://www.mathworks.com/help/matlab/calling-matlab-engine-from-c-programs-1.html)
   allows PETSc to automatically call MATLAB to perform some specific computations. This
   does not allow MATLAB to be used interactively by the user. See
   `PetscMatlabEngine`.
3. You can open a socket connection between MATLAB and PETSc to allow sending objects back
   and forth between an interactive MATLAB session and a running PETSc program. See
   `PetscViewerSocketOpen()` for access from the PETSc side and
   `share/petsc/matlab/PetscBinaryRead.m` for access from the MATLAB side.
4. You can save PETSc `Vec` (**not** `Mat`) with the `PetscViewerMatlabOpen()`
   viewer, which saves `.mat` files that can then be loaded into MATLAB using the `load()` command.

### How do I get started with Cython so that I can extend petsc4py?

1. Learn how to [build a Cython module](http://docs.cython.org/src/quickstart/build.html).
2. Go through the simple [example](https://stackoverflow.com/questions/3046305/simple-wrapping-of-c-code-with-cython). Note
   also the next comment, which shows how to create numpy arrays in Cython and pass them
   back.
3. Check out [this page](http://docs.cython.org/src/tutorial/numpy.html) which tells
   you how to get fast indexing.
4. Have a look at the petsc4py [array source](http://code.google.com/p/petsc4py/source/browse/src/PETSc/arraynpy.pxi).

### How do I compute a custom norm for KSP to use as a convergence test or for monitoring?

You need to call `KSPBuildResidual()` on your `KSP` object and then compute the
appropriate norm on the resulting residual. Note that depending on the
`KSPSetNormType()` of the method you may not return the same norm as provided by the
method. See also `KSPSetPCSide()`.

### If I have a sequential program can I use a PETSc parallel solver?

:::{important}
Do not expect to get great speedups! Much of the speedup gained by using parallel
solvers comes from building the underlying matrices and vectors in parallel to begin
with. You should see some reduction in the time for the linear solvers.
:::

Yes, you must set up PETSc with MPI (even though you will not use MPI) with at least the
following options:

```console
$ ./configure --download-superlu_dist --download-parmetis --download-metis --with-openmp
```

Your compiler must support OpenMP. To have the linear solver run in parallel run your
program with

```console
$ OMP_NUM_THREADS=n ./myprog -pc_type lu -pc_factor_mat_solver_type superlu_dist
```

where `n` is the number of threads and should be less than or equal to the number of cores
available.

:::{note}
If your code is MPI parallel you can also use these same options to have SuperLU_dist
utilize multiple threads per MPI process for the direct solver. Make sure that the
`$OMP_NUM_THREADS` you use per MPI process is less than or equal to the number of
cores available for each MPI process. For example if your compute nodes have 6 cores
and you use 2 MPI processes per node then set `$OMP_NUM_THREADS` to 2 or 3.
:::

Another approach that allows using a PETSc parallel solver is to use `PCMPI`.

### TS or SNES produces infeasible (out of domain) solutions or states. How can I prevent this?

For `TS` call `TSSetFunctionDomainError()`. For both `TS` and `SNES` call
`SNESSetFunctionDomainError()` when the solver passes an infeasible (out of domain)
solution or state to your routines.

If it occurs for DAEs, it is important to ensure the algebraic constraints are well
satisfied, which can prevent "breakdown" later. Thus, one can try using a tight tolerance
for `SNES`, using a direct linear solver (`PCType` of `PCLU`) when possible, and reducing the timestep (or
tightening `TS` tolerances for adaptive time stepping).

### Can PETSc work with Hermitian matrices?

PETSc's support of Hermitian matrices is limited. Many operations and solvers work
for symmetric (`MATSBAIJ`) matrices and operations on transpose matrices, but there is
little direct support for Hermitian matrices and Hermitian transpose (complex conjugate
transpose) operations. There is `KSPSolveTranspose()` for solving the transpose of a
linear system but no `KSPSolveHermitian()`.

For creating known Hermitian matrices:

- `MatCreateNormalHermitian()`
- `MatCreateHermitianTranspose()`

For determining or setting Hermitian status on existing matrices:

- `MatIsHermitian()`
- `MatIsHermitianKnown()`
- `MatIsStructurallySymmetric()`
- `MatIsSymmetricKnown()`
- `MatIsSymmetric()`
- `MatSetOption()` (use with `MAT_SYMMETRIC` or `MAT_HERMITIAN` to assert to PETSc
  that either is the case).

For performing matrix operations on known Hermitian matrices (note that regular `Mat`
functions such as `MatMult()` will of course also work on Hermitian matrices):

- `MatMultHermitianTranspose()`
- `MatMultHermitianTransposeAdd()` (very limited support)

### How can I assemble a bunch of similar matrices?

You can first add the values common to all the matrices, then use `MatStoreValues()` to
stash the common values. Each iteration, call `MatRetrieveValues()`, then set the
unique values in a matrix and assemble.

### Can one resize or change the size of PETSc matrices or vectors?

No. Once the vector or matrix sizes have been set and the matrices or vectors are fully
usable, one cannot change the size of the matrices or vectors or the number of processes they
live on. One may create new vectors and copy, for example using `VecScatterCreate()`,
the old values from the previous vector.

### How can one compute the nullspace of a sparse matrix with MUMPS?

Assuming you have an existing matrix $A$ whose nullspace $V$ you want to find:

```
Mat      F, work, V;
PetscInt N, rows;
MPI_Comm comm;

/* Determine factorability */
PetscCall(PetscObjectGetComm((PetscObject) A, &comm));
PetscCall(MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F));
PetscCall(MatGetLocalSize(A, &rows, NULL));

/* Set MUMPS options, see MUMPS documentation for more information */
PetscCall(MatMumpsSetIcntl(F, 24, 1));
PetscCall(MatMumpsSetIcntl(F, 25, 1));

/* Perform factorization */
PetscCall(MatLUFactorSymbolic(F, A, NULL, NULL, NULL));
PetscCall(MatLUFactorNumeric(F, A, NULL));

/* This is the dimension of the null space */
PetscCall(MatMumpsGetInfog(F, 28, &N));

/* This will contain the null space in the columns */
PetscCall(MatCreateDense(comm, rows, N, PETSC_DETERMINE, PETSC_DETERMINE, NULL, &V));

PetscCall(MatDuplicate(V, MAT_DO_NOT_COPY_VALUES, &work));
PetscCall(MatMatSolve(F, work, V));
```
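
The end result of the MUMPS options (detect the null-space dimension, then solve for a basis) can be illustrated with a dense SVD in NumPy (not the MUMPS algorithm; a small made-up rank-deficient matrix):

```python
import numpy as np

# A rank-deficient 4x4 matrix: row 3 = row 0 + row 1
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 0.0],
              [2.0, 0.0, 1.0, 1.0],
              [1.0, 3.0, 1.0, 1.0]])

U, s, Vt = np.linalg.svd(A)
tol = 1e-10
N = int(np.sum(s < tol))   # dimension of the null space (the role of INFOG(28))
V = Vt[s < tol].T          # null-space basis in the columns

assert N == 1
assert np.allclose(A @ V, 0.0, atol=1e-10)
```

A dense SVD is only feasible for small matrices; for large sparse matrices the MUMPS route above (or SLEPc) is the practical option.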

______________________________________________________________________

## Execution

### PETSc executables are SO big and take SO long to link

:::{note}
See the {ref}`shared libraries section <doc_faq_sharedlibs>` for more information.
:::

We find this annoying as well. On most machines PETSc can use shared libraries, so
executables should be much smaller; run `configure` with the additional option
`--with-shared-libraries` (this is the default). Also, if you have room, compiling and
linking PETSc on your machine's `/tmp` disk or similar local disk, rather than over the
network, will be much faster.

### How does PETSc's `-help` option work? Why is it different for different programs?

There are two ways in which one interacts with the options database:

- `PetscOptionsGetXXX()` where `XXX` is some type or data structure (for example
  `PetscOptionsGetBool()` or `PetscOptionsGetScalarArray()`). This is a classic
  "getter" function, which queries the command line options for a matching option name,
  and returns the specified value.
- `PetscOptionsXXX()` where `XXX` is some type or data structure (for example
  `PetscOptionsBool()` or `PetscOptionsScalarArray()`). This is a so-called "provider"
  function. It first records the option name in an internal list of previously encountered
  options before calling `PetscOptionsGetXXX()` to query the status of said option.

While users generally use the first option, developers will *always* use the second
(provider) variant of functions. Thus, as the program runs, it will build up a list of
encountered option names which are then printed **in the order of their appearance on the
root rank**. Different programs may take different paths through PETSc source code, so
they will encounter different providers, and therefore have different `-help` output.

### PETSc has so many options for my program that it is hard to keep them straight

Running the PETSc program with the option `-help` will print out many of the options. To
print the options that have been specified within a program, employ `-options_left` to
print any options that the user specified but were not actually used by the program and
all options used; this is helpful for detecting typos. The PETSc website has a search option,
in the upper right hand corner, that quickly finds answers to most PETSc questions.

### PETSc automatically handles many of the details in parallel PDE solvers. How can I understand what is really happening within my program?

You can use the option `-info` to get more details about the solution process. The
option `-log_view` provides details about the distribution of time spent in the various
phases of the solution process. You can run with `-ts_view`, `-snes_view`, or
`-ksp_view` to see what solver options are being used. Run with `-ts_monitor`,
`-snes_monitor`, or `-ksp_monitor` to watch convergence of the
methods. `-snes_converged_reason` and `-ksp_converged_reason` will indicate why and if
the solvers have converged.

### Assembling large sparse matrices takes a long time. What can I do to make this process faster? Or MatSetValues() is so slow; what can I do to speed it up?

You probably need to do preallocation, as explained in {any}`sec_matsparse`.
See also the {ref}`performance chapter <ch_performance>` of the users manual.

For GPUs (and even CPUs) you can use `MatSetPreallocationCOO()` and `MatSetValuesCOO()` for more rapid assembly.

### How can I generate performance summaries with PETSc?

Use the runtime option `-log_view`. It outputs a comprehensive timing, memory consumption, and
communications digest for your program. See the {ref}`profiling chapter <ch_profiling>` of the users
manual for information on interpreting the summary data.

### How do I know the amount of time spent on each level of the multigrid solver/preconditioner?

Run with `-log_view` and `-pc_mg_log`.

### Where do I get the input matrices for the examples?

Some examples use `$DATAFILESPATH/matrices/medium` and other files. These test matrices
in PETSc binary format can be found in the [datafiles repository](https://gitlab.com/petsc/datafiles).

### When I dump some matrices and vectors to binary, I seem to be generating some empty files with `.info` extensions. What's the deal with these?

PETSc binary viewers put some additional information into `.info` files, like the matrix
block size. It is harmless, but if you *really* don't like it you can use
`-viewer_binary_skip_info` or `PetscViewerBinarySkipInfo()`.

:::{note}
You need to call `PetscViewerBinarySkipInfo()` before
`PetscViewerFileSetName()`. In other words you **cannot** use
`PetscViewerBinaryOpen()` directly.
:::

### Why is my parallel solver slower than my sequential solver, or why do I have poor speed-up?

This can happen for many reasons:

1. Make sure it is truly the time in `KSPSolve()` that is slower (by running the code
   with `-log_view`). Often the slower time is in generating the matrix or some other
   operation.
2. There must be enough work for each process to outweigh the communication time. We
   recommend an absolute minimum of about 10,000 unknowns per process; better is 20,000 or
   more. This is even more true when using multiple GPUs, where you need to have millions
   of unknowns per GPU.
3. Make sure the {ref}`communication speed of the parallel computer
   <doc_faq_general_parallel>` is good enough for parallel solvers.
4. Check the number of solver iterations with the parallel solver against the sequential
   solver. Most preconditioners require more iterations when used on more processes; this
   is particularly true for block Jacobi (the default parallel preconditioner). You can
   try `-pc_type asm` (`PCASM`), whose iterations scale a bit better with more
   processes. You may also consider multigrid preconditioners like `PCMG` or BoomerAMG
   in `PCHYPRE`.
1407
(doc_faq_pipelined)=

### What steps are necessary to make the pipelined solvers execute efficiently?

Pipelined solvers like `KSPPGMRES`, `KSPPIPECG`, `KSPPIPECR`, and `KSPGROPPCG` may
require special MPI configuration to effectively overlap reductions with computation. In
general, this requires an MPI-3 implementation, an implementation that supports multiple
threads, and use of a "progress thread". Consult the documentation from your vendor or
computing facility for more details.

- Cray MPT 5.6 and later support MPI-3. Set `$MPICH_MAX_THREAD_SAFETY` to "multiple"
  for threads, plus either `$MPICH_ASYNC_PROGRESS` or
  `$MPICH_NEMESIS_ASYNC_PROGRESS`. E.g.
  ```console
  $ export MPICH_MAX_THREAD_SAFETY=multiple
  $ export MPICH_ASYNC_PROGRESS=1
  $ export MPICH_NEMESIS_ASYNC_PROGRESS=1
  ```

- MPICH version 3.0 and later implements the MPI-3 standard and the default
  configuration supports use of threads. Use of a progress thread is configured by
  setting the environment variable `$MPICH_ASYNC_PROGRESS`. E.g.
  ```console
  $ export MPICH_ASYNC_PROGRESS=1
  ```

### When using PETSc in single precision mode (`--with-precision=single` when running `configure`) are the operations done in single or double precision?

PETSc does **NOT** do any explicit conversion of single precision to double before
performing computations; what actually happens depends on the hardware and the compiler. For
example, the compiler could choose to put the single precision numbers into the usual
double precision registers and then use the usual double precision floating point unit, or
it could use SSE2 instructions that work directly on the single precision numbers. It is a
bit of a mystery what decisions get made sometimes. There may be compiler flags in some
circumstances that can affect this.

### Why is Newton's method (SNES) not converging, or converging slowly?

Newton's method may not converge for many reasons; here are some of the most common:

- The Jacobian is wrong (or correct in sequential but not in parallel).
- The linear system is {ref}`not solved <doc_faq_execution_kspconv>` or is not solved
  accurately enough.
- The Jacobian system has a singularity that the linear solver is not handling.
- There is a bug in the function evaluation routine.
- The function is not continuous or does not have continuous first derivatives (e.g. phase
  change or TVD limiters).
- The equations may not have a solution (e.g. limit cycle instead of a steady state) or
  there may be a "hill" between the initial guess and the steady state (e.g. reactants
  must ignite and burn before reaching a steady state, but the steady-state residual will
  be larger during combustion).

Here are some ways to help debug lack of convergence of Newton:

- Run on one processor to see if the problem occurs only in parallel.

- Run with `-info` to get more detailed information on the solution process.

- Run with the options

  ```text
  -snes_monitor -ksp_monitor_true_residual -snes_converged_reason -ksp_converged_reason
  ```

  - If the linear solve does not converge, check if the Jacobian is correct, then see
    {ref}`this question <doc_faq_execution_kspconv>`.
  - If the preconditioned residual converges, but the true residual does not, the
    preconditioner may be singular.
  - If the linear solve converges well, but the line search fails, the Jacobian may be
    incorrect.

- Run with `-pc_type lu` or `-pc_type svd` to see if the problem is a poor linear
  solver.

- Run with `-mat_view` or `-mat_view draw` to see if the Jacobian looks reasonable.

- Run with `-snes_test_jacobian -snes_test_jacobian_view` to see if the Jacobian you are
  using is wrong. Compare the output when you add `-mat_fd_type ds` to see if the result
  is sensitive to the choice of differencing parameter.

- Run with `-snes_mf_operator -pc_type lu` to see if the Jacobian you are using is
  wrong. If the problem is too large for a direct solve, try

  ```text
  -snes_mf_operator -pc_type ksp -ksp_ksp_rtol 1e-12
  ```

  Compare the output when you add `-mat_mffd_type ds` to see if the result is sensitive
  to the choice of differencing parameter.

- Run with `-snes_linesearch_monitor` to see if the line search is failing (this is
  usually a sign of a bad Jacobian). Use `-info` in PETSc 3.1 and older versions,
  `-snes_ls_monitor` in PETSc 3.2, and `-snes_linesearch_monitor` in PETSc 3.3 and
  later.

Here are some ways to help the Newton process if everything above checks out:

- Run with grid sequencing (`-snes_grid_sequence` is all you need if working with a
  `DM`) to generate a better initial guess on your finer mesh.

- Run with quad precision, i.e.

  ```console
  $ ./configure --with-precision=__float128 --download-f2cblaslapack
  ```

  :::{note}
  Quad precision requires PETSc 3.2 or later and recent versions of the GNU compilers.
  :::

- Change the units (nondimensionalization), boundary condition scaling, or formulation so
  that the Jacobian is better conditioned. See the [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).

- Mollify features in the function that do not have continuous first derivatives (often
  occurs when there are "if" statements in the residual evaluation, e.g. phase change or
  TVD limiters). Use a variational inequality solver (`SNESVINEWTONRSLS`) if the
  discontinuities are of fundamental importance.

- Try a trust region method (`-snes_type newtontr`; you may have to adjust the parameters).

- Run with some continuation parameter from a point where you know the solution; see
  `TSPSEUDO` for steady states.

- There are homotopy solver packages like PHCpack that can find all possible solutions
  (and tell you that they have found them all), but they are not scalable and cannot solve
  anything but small problems.

(doc_faq_execution_kspconv)=

### Why is the linear solver (KSP) not converging, or converging slowly?

:::{tip}
Always run with `-ksp_converged_reason -ksp_monitor_true_residual` when trying to
learn why a method is not converging!
:::

Common reasons for KSP not converging are:

- A symmetric method is being used for a non-symmetric problem.

- The equations are singular by accident (e.g. forgot to impose boundary
  conditions). Check this for a small problem using `-pc_type svd -pc_svd_monitor`.

- The equations are intentionally singular (e.g. constant null space), but the Krylov
  method was not informed; see `MatSetNullSpace()`. Always inform the Krylov solver
  of any known singularity.

- The equations are intentionally singular and `MatSetNullSpace()` was used, but the
  right-hand side is not consistent. You may have to call `MatNullSpaceRemove()` on the
  right-hand side before calling `KSPSolve()`. See `MatSetTransposeNullSpace()`.

- The equations are indefinite so that standard preconditioners don't work. Usually you
  will know this from the physics, but you can check with

  ```text
  -ksp_compute_eigenvalues -ksp_gmres_restart 1000 -pc_type none
  ```

  For simple saddle point problems, try

  ```text
  -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_detect_saddle_point
  ```

  For more difficult problems, read the literature to find robust methods and ask
  <mailto:petsc-users@mcs.anl.gov> or <mailto:petsc-maint@mcs.anl.gov> if you want advice about how to
  implement them.

- If the method converges in the preconditioned residual, but not in the true residual, the
  preconditioner is likely singular or nearly so. This is common for saddle point problems
  (e.g. incompressible flow) or strongly nonsymmetric operators (e.g. low-Mach hyperbolic
  problems with large time steps).

- The preconditioner is too weak or is unstable. See if `-pc_type asm -sub_pc_type lu`
  improves the convergence rate. If GMRES is losing too much progress in the restart, see
  if longer restarts help (`-ksp_gmres_restart 300`). If a transpose is available, try
  `-ksp_type bcgs` or other methods that do not require a restart.

  :::{note}
  Unfortunately convergence with these methods is frequently erratic.
  :::

- The preconditioner is nonlinear (e.g. a nested iterative solve); try `-ksp_type
  fgmres` or `-ksp_type gcr`.

- You are using geometric multigrid, but some equations (often boundary conditions) are
  not scaled compatibly between levels. Either try `-pc_mg_galerkin` to algebraically
  construct a correctly scaled coarse operator, or make sure that all the equations are
  scaled in the same way if you want to use rediscretized coarse levels.

- The matrix is very ill-conditioned. Check the {ref}`condition number <doc_faq_usage_condnum>`.

  - Try to improve it by choosing the relative scaling of components/boundary conditions.
  - Try `-ksp_diagonal_scale -ksp_diagonal_scale_fix`.
  - Perhaps change the formulation of the problem to produce more friendly algebraic
    equations.

- Change the units (nondimensionalization), boundary condition scaling, or formulation so
  that the Jacobian is better conditioned. See the [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).

- Classical Gram-Schmidt is becoming unstable; try `-ksp_gmres_modifiedgramschmidt` or
  use a method that orthogonalizes differently, e.g. `-ksp_type gcr`.

### I get the error message: Actual argument at (1) to assumed-type dummy is of derived type with type-bound or FINAL procedures

Use the following code snippet:

```fortran
module context_module
#include <petsc/finclude/petsc.h>
  use petsc
  implicit none
  private
  type, public :: context_type
    private
    PetscInt :: foo
  contains
    procedure, public :: init => context_init
  end type context_type
contains
  subroutine context_init(self, foo)
    class(context_type), intent(in out) :: self
    PetscInt, intent(in) :: foo
    self%foo = foo
  end subroutine context_init
end module context_module

!------------------------------------------------------------------------

program test_snes
  use, intrinsic :: iso_c_binding
  use petsc
  use context_module
  implicit none

  SNES :: snes
  type(context_type), target :: context
  type(c_ptr) :: contextOut
  PetscErrorCode :: ierr

  call PetscInitialize(PETSC_NULL_CHARACTER, ierr)
  call SNESCreate(PETSC_COMM_WORLD, snes, ierr)
  call context%init(1)

  contextOut = c_loc(context) ! contextOut is a C pointer to context

  call SNESSetConvergenceTest(snes, convergence, contextOut, PETSC_NULL_FUNCTION, ierr)

  call SNESDestroy(snes, ierr)
  call PetscFinalize(ierr)

contains

  subroutine convergence(snes, num_iterations, xnorm, pnorm, fnorm, reason, contextIn, ierr)
    SNES, intent(in) :: snes
    PetscInt, intent(in) :: num_iterations
    PetscReal, intent(in) :: xnorm, pnorm, fnorm
    SNESConvergedReason, intent(out) :: reason
    type(c_ptr), intent(in out) :: contextIn
    type(context_type), pointer :: context
    PetscErrorCode, intent(out) :: ierr

    ! convert the C pointer to a Fortran pointer so context can be used as in the main program
    call c_f_pointer(contextIn, context)
    reason = 0
    ierr = 0
  end subroutine convergence
end program test_snes
```

### In C++ I get a crash on `VecDestroy()` (or some other PETSc object) at the end of the program

This can happen when the destructor for a C++ class is automatically called at the end of
the program, after `PetscFinalize()`. Introduce an inner scope so that all destructors
run before `PetscFinalize()`:

```
int main(int argc, char **argv)
{
  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  {
    /* your variables and your code */

    /* all your destructors are called here automatically by C++,
       before PetscFinalize(), so they work correctly */
  }
  PetscCall(PetscFinalize());
  return 0;
}
```

______________________________________________________________________

## Debugging

### What does the message hwloc/linux: Ignoring PCI device with non-16bit domain mean?

This is printed by the hwloc library that is used by some MPI implementations. It can be ignored.
To prevent the message from being printed, set the environment variable `HWLOC_HIDE_ERRORS` to 2.
For example

```
export HWLOC_HIDE_ERRORS=2
```

which can be added to your login profile file such as `~/.bashrc`.

### How do I turn off PETSc signal handling so I can use the `-C` option on `xlf`?

Immediately after calling `PetscInitialize()` call `PetscPopSignalHandler()`.

Some Fortran compilers, including IBM's xlf and xlF, have a compile option
(`-C` for IBM's) that causes all array accesses in Fortran to be checked for being
in-bounds. This is a great feature, but it does require that the array dimensions be set
explicitly, not with a \*.

### How do I debug if `-start_in_debugger` does not work on my machine?

The script <https://github.com/Azrael3000/tmpi> can be used to run multiple MPI
ranks in the debugger using tmux.

On newer macOS machines, one has to be in the admin group to be able to use the debugger.

On newer Ubuntu Linux machines, one has to disable `ptrace_scope` with

```console
$ echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
```

to get `-start_in_debugger` working. (Note that `sudo echo 0 > /proc/...` does not
work, because the redirection is performed by the unprivileged shell, not by `sudo`.)

If `-start_in_debugger` does not work on your OS, for a uniprocessor job, just
try the debugger directly, for example: `gdb ex1`. You can also use [TotalView](https://totalview.io/products/totalview), which is a good graphical parallel debugger.

### How do I see where my code is hanging?

You can use the `-start_in_debugger` option to start all processes in the debugger (each
will come up in its own xterm) or run in [TotalView](https://totalview.io/products/totalview). Then use `cont` (for continue) in each
xterm. Once you are sure that the program is hanging, hit control-c in each xterm and then
use `where` to print a stack trace for each process.

### How can I inspect PETSc vector and matrix values when in the debugger?

We will illustrate this with `gdb`, but it should be similar in other debuggers. You can
look at local `Vec` values directly by obtaining the array. For a `Vec` v, we can
print all local values using:

```console
(gdb) p ((Vec_Seq*) v->data)->array[0]@v->map->n
```

However, this becomes much more complicated for a matrix. Therefore, it is advisable to use the default viewer to look at the object. For a `Vec` v and a `Mat` m, this would be:

```console
(gdb) call VecView(v, 0)
(gdb) call MatView(m, 0)
```

or with a communicator other than `MPI_COMM_WORLD`:

```console
(gdb) call MatView(m, PETSC_VIEWER_STDOUT_(m->comm))
```

TotalView 8.8.0+ has a feature that allows libraries to provide their own code to
display objects in the debugger. Thus in theory each PETSc object, `Vec`, `Mat`, etc.
could have custom code to print values in the object. We have only done this for the most
elementary display of `Vec` and `Mat`. See the routine `TV_display_type()` in
`src/vec/vec/interface/vector.c` for an example of how these may be written. Contact us
if you would like to add more.

### How can I find the cause of floating point exceptions like not-a-number (NaN) or infinity?

The best way to locate floating point exceptions is to use a debugger. On supported
architectures (including Linux and glibc-based systems), just run in a debugger and pass
`-fp_trap` to the PETSc application. This will activate signaling exceptions and the
debugger will break on the line that first divides by zero or otherwise generates an
exception.

Without a debugger, running with `-fp_trap` in debug mode will only identify the
function in which the error occurred, but not the line or the type of exception. If
`-fp_trap` is not supported on your architecture, consult the documentation for your
debugger, since there is likely a way to have it catch exceptions.

(error_libimf)=

### Error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory

The Intel compilers use shared libraries (like libimf) that cannot be found, by default, at run
time. When using the Intel compilers (and running the resulting code) you must make sure
that the proper Intel initialization scripts are run. This is usually done by adding some
code into your `.cshrc`, `.bashrc`, `.profile` etc. file. Sometimes on batch
systems that do not access your initialization files (like .cshrc) you must include the
initialization calls in your batch file submission.

For example, on my Mac using `csh` I have the following in my `.cshrc` file:

```csh
source /opt/intel/cc/10.1.012/bin/iccvars.csh
source /opt/intel/fc/10.1.012/bin/ifortvars.csh
source /opt/intel/idb/10.1.012/bin/idbvars.csh
```

And in my `.profile` I have

```csh
source /opt/intel/cc/10.1.012/bin/iccvars.sh
source /opt/intel/fc/10.1.012/bin/ifortvars.sh
source /opt/intel/idb/10.1.012/bin/idbvars.sh
```

(object_type_not_set)=

### What does "Object Type Not Set: Argument # N" mean?

Many operations on PETSc objects require that the specific type of the object be set before the operation is performed. You must call `XXXSetType()` or `XXXSetFromOptions()` before you make the offending call. For example

```
Mat A;

PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
PetscCall(MatSetValues(A, ...));
```

will not work. You must add `MatSetType()` or `MatSetFromOptions()` before the call to `MatSetValues()`. I.e.

```
Mat A;

PetscCall(MatCreate(PETSC_COMM_WORLD, &A));

PetscCall(MatSetType(A, MATAIJ));
/* MatSetFromOptions() will override MatSetType() if a type is given on the command line */
PetscCall(MatSetFromOptions(A));

PetscCall(MatSetValues(A, ...));
```

(split_ownership)=

### What does the error detected in `PetscSplitOwnership()` about "sum of local lengths ..." mean?

In a previous call to `VecSetSizes()`, `MatSetSizes()`, `VecCreateXXX()` or
`MatCreateXXX()` you passed in local and global sizes that do not make sense for the
number of processes. For example, if you pass in a local size of 2 and a global
size of 100 and run on two processes, this cannot work since the sum of the local sizes
is 4, not 100.

(valgrind)=

### What does "corrupt argument" or "caught signal" or "SEGV" or "segmentation violation" or "bus error" mean? Can I use Valgrind or CUDA-Memcheck to debug memory corruption issues?

Sometimes it can mean an argument to a function is invalid. In Fortran this may be caused
by forgetting to list an argument in the call, especially the final `PetscErrorCode`.

Otherwise it is usually caused by memory corruption; that is, somewhere the code is writing
out of array bounds. To track this down, rerun the debug version of the code with the
option `-malloc_debug`. Occasionally the code may crash only with the optimized version;
in that case run the optimized version with `-malloc_debug`. If you determine the
problem is from memory corruption you can put the macro `CHKMEMQ` in the code near the crash
to determine exactly what line is causing the problem.

If `-malloc_debug` does not help: on NVIDIA CUDA systems you can use <https://docs.nvidia.com/cuda/cuda-memcheck/index.html>.

If `-malloc_debug` does not help: on GNU/Linux (but not macOS) machines you can
use [Valgrind](http://valgrind.org). Follow the instructions below:

1. `configure` PETSc with `--download-mpich --with-debugging` (you can use other MPI implementations but most produce spurious Valgrind messages).

2. Compile your application code with this build of PETSc.

3. Run with Valgrind.

   ```console
   $ $PETSC_DIR/lib/petsc/bin/petscmpiexec -valgrind -n NPROC PETSCPROGRAMNAME PROGRAMOPTIONS
   ```

   or

   ```console
   $ mpiexec -n NPROC valgrind --tool=memcheck -q --num-callers=20 \
   --suppressions=$PETSC_DIR/share/petsc/suppressions/valgrind \
   --log-file=valgrind.log.%p PETSCPROGRAMNAME -malloc off PROGRAMOPTIONS
   ```

:::{note}
- The option `--with-debugging` enables Valgrind to give a stack trace with additional
  source-file:line-number information.
- The option `--download-mpich` gives a Valgrind-clean MPI; other MPI builds are not Valgrind clean.
- When `--download-mpich` is used, mpiexec will be in `$PETSC_ARCH/bin`.
- The `--log-file=valgrind.log.%p` option tells Valgrind to store the output from each
  process in a different file (as %p, i.e. the PID, is different for each MPI process).
- `memcheck` will not find certain array accesses that violate static array
  declarations, so if memcheck runs clean you can try `--tool=exp-ptrcheck`
  instead.
:::

You might also consider using <http://drmemory.org>, which has support for GNU/Linux, Apple
macOS, and Microsoft Windows machines. (Note: we haven't tried this ourselves.)

(zeropivot)=

### What does "detected zero pivot in LU factorization" mean?

A zero pivot in LU, ILU, Cholesky, or ICC sparse factorization does not always mean that
the matrix is singular. You can use

```text
-pc_factor_shift_type nonzero -pc_factor_shift_amount [amount]
```

or

```text
-pc_factor_shift_type positive_definite -[level]_pc_factor_shift_type nonzero
 -pc_factor_shift_amount [amount]
```

or

```text
-[level]_pc_factor_shift_type positive_definite
```

to prevent the zero pivot. [level] is "sub" when lu, ilu, cholesky, or icc are employed in
each individual block of the bjacobi or ASM preconditioner; [level] is "mg_levels" or
"mg_coarse" when lu, ilu, cholesky, or icc are used inside multigrid smoothers or as the
coarse grid solver. See `PCFactorSetShiftType()`, `PCFactorSetShiftAmount()`.

This error can also happen if your matrix is singular; see `MatSetNullSpace()` for how
to handle this. If this error occurs in the zeroth row of the matrix, it is likely you
have an error in the code that generates the matrix.

### You create draw windows or `PETSCVIEWERDRAW` windows or use options `-ksp_monitor draw::draw_lg` or `-snes_monitor draw::draw_lg` and the program seems to run OK but windows never open

The libraries were compiled without support for X windows. Make sure that `configure`
was run with the option `--with-x`.

### The program seems to use more and more memory as it runs, even though you don't think you are allocating more memory

Some of the following may be occurring:

- You are creating new PETSc objects but never freeing them.
- There is a memory leak in PETSc or your code.
- Something much more subtle (if you are using Fortran): when you declare a large array
  in Fortran, the operating system does not allocate all the memory pages for that array
  until you start using the different locations in the array. Thus, in a code, if at each
  step you start using later values in the array your virtual memory usage will "continue"
  to increase as measured by `ps` or `top`.
- You are running with the `-log`, `-log_mpe`, or `-log_all` option. With these
  options, a great deal of logging information is stored in memory until the conclusion of
  the run.
- You are linking with the MPI profiling libraries; these cause logging of all MPI
  activities. Another symptom is that at the conclusion of the run it may print some message
  about writing log files.

The following may help:

- Run with the `-malloc_debug` option and `-malloc_view`. Or use `PetscMallocDump()`
  and `PetscMallocView()` sprinkled about your code to track memory that is allocated
  and not later freed. Use the commands `PetscMallocGetCurrentUsage()` and
  `PetscMemoryGetCurrentUsage()` to monitor memory allocated and
  `PetscMallocGetMaximumUsage()` and `PetscMemoryGetMaximumUsage()` for the total memory
  used as the code progresses.
- This is just the way Unix works and is harmless.
- Do not use the `-log`, `-log_mpe`, or `-log_all` option, or use
  `PetscLogEventDeactivate()` or `PetscLogEventDeactivateClass()` to turn off logging of
  specific events.
- Make sure you do not link with the MPI profiling libraries.

### When calling `MatPartitioningApply()` you get a message "Error! Key 16615 Not Found"

The graph of the matrix you are using is not symmetric. You must use symmetric matrices
for partitioning.

### With GMRES at restart the second residual norm printed does not match the first

I.e.

```text
26 KSP Residual norm 3.421544615851e-04
27 KSP Residual norm 2.973675659493e-04
28 KSP Residual norm 2.588642948270e-04
29 KSP Residual norm 2.268190747349e-04
30 KSP Residual norm 1.977245964368e-04
30 KSP Residual norm 1.994426291979e-04 <----- At restart the residual norm is printed a second time
```

This is actually not surprising! GMRES computes the norm of the residual at each iteration
via a recurrence relation between the norms of the residuals at the previous iterations
and quantities computed at the current iteration. It does not compute it directly
as $|| b - A x^{n} ||$.
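
Concretely (a standard identity; see, e.g., Saad, *Iterative Methods for Sparse Linear Systems*): after $k$ Arnoldi steps GMRES minimizes $|| \beta e_1 - \bar{H}_k y ||$, and when this small least-squares problem is solved with Givens rotations the residual norm is available for free as the magnitude of the last entry of the rotated right-hand side,

$$|| b - A x^{k} || = | (g_k)_{k+1} |,$$

where $g_k$ is the result of applying the accumulated rotations to $\beta e_1$. This identity is exact in exact arithmetic; in floating point the accumulated rotations can drift, which is what the direct computation at restart exposes.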

Sometimes, especially with an ill-conditioned matrix, or computation of the matrix-vector
product via differencing, the residual norms computed by GMRES start to "drift" from the
correct values. At the restart, we compute the residual norm directly, hence the "strange
stuff," the difference printed. The drifting, if it remains small, is harmless (it doesn't
affect the accuracy of the solution that GMRES computes).

If you use a more powerful preconditioner the drift will often be smaller and less
noticeable. Or if you are running matrix-free you may need to tune the matrix-free
parameters.

### Why do some Krylov methods seem to print two residual norms per iteration?

I.e.

```text
1198 KSP Residual norm 1.366052062216e-04
1198 KSP Residual norm 1.931875025549e-04
1199 KSP Residual norm 1.366026406067e-04
1199 KSP Residual norm 1.931819426344e-04
```

Some Krylov methods, for example `KSPTFQMR`, actually have a "sub-iteration" of size 2
inside the loop. Each of the two substeps has its own matrix-vector product and
application of the preconditioner and updates the residual approximations. This is why you
get this "funny" output where it looks like there are two residual norms per
iteration. You can also think of it as twice as many iterations.

### Unable to locate PETSc dynamic library `libpetsc`

When using DYNAMIC libraries, the libraries cannot be moved after they are
installed. This can also happen on clusters where the paths are different on the (run)
nodes than on the (compile) front-end. **Do not use dynamic libraries and shared
libraries**. Run `configure` with
`--with-shared-libraries=0 --with-dynamic-loading=0`.

:::{important}
The `--with-dynamic-loading` option was removed in PETSc 3.5.
:::

### How do I determine what update to PETSc broke my code?

If at some point (in PETSc code history) you had a working code, but the latest PETSc
code broke it, it is possible to determine the PETSc code change that might have caused this
behavior. This is achieved by:

- Using Git to access the PETSc sources
- Knowing the Git commit for the known working version of PETSc
- Knowing the Git commit for the known broken version of PETSc
- Using the [bisect](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html)
  functionality of Git

Git bisect can be done as follows:

1. Get the PETSc development (main branch in Git) sources

   ```console
   $ git clone https://gitlab.com/petsc/petsc.git
   ```

2. Find the good and bad markers to start the bisection process. This can be done either
   by checking `git log` or `gitk` or <https://gitlab.com/petsc/petsc> or the web
   history of petsc-release clones. Let's say the known bad commit is 21af4baa815c and
   the known good commit is 5ae5ab319844.

3. Start the bisection process with these known revisions. Build PETSc, and test your code
   to confirm known good/bad behavior:

   ```console
   $ git bisect start 21af4baa815c 5ae5ab319844
   ```

   build/test, perhaps discover that this new state is bad

   ```console
   $ git bisect bad
   ```

   build/test, perhaps discover that this state is good

   ```console
   $ git bisect good
   ```

   Keep bisecting, building PETSc, and testing your code with it, determining whether the
   code works, until done. After something like 5-15 iterations, `git bisect` will
   pinpoint the exact code change that resulted in the difference in application
   behavior.

:::{tip}
See [git-bisect(1)](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html) and the
[debugging section of the Git Book](https://git-scm.com/book/en/Git-Tools-Debugging-with-Git) for more debugging tips.
:::

### How to fix the error "PMIX Error: error in file gds_ds12_lock_pthread.c"?

This seems to be an error when using Open MPI and OpenBLAS with threads (or perhaps other
packages that use threads). Setting

```console
$ export PMIX_MCA_gds=hash
```

should resolve the problem.

______________________________________________________________________

(doc_faq_sharedlibs)=

## Shared Libraries

### Can I install PETSc libraries as shared libraries?

Yes. Use

```console
$ ./configure --with-shared-libraries
```

### Why should I use shared libraries?

When you link to shared libraries, the function symbols from the shared libraries are not
copied into the executable. This way the size of the executable is considerably smaller than
when using static libraries. This helps in a couple of ways:

- It saves disk space when more than one executable is created.
- It improves link time immensely, because the linker has to write a much smaller
  file (the executable) to the disk.

### How do I link to the PETSc shared libraries?

By default, the compiler should pick up the shared libraries instead of the static
ones. Nothing special should be done for this.

### What if I want to link to the regular `.a` library files?

You must run `configure` with the option `--with-shared-libraries=0` (you can use a
different `$PETSC_ARCH` for this build so you can easily switch between the two).

### What do I do if I want to move my executable to a different machine?

You will also need access to the shared libraries on the new machine. The other
alternative is to build the executable without shared libraries by first deleting the
shared libraries and then creating the executable.

```{bibliography} /petsc.bib
:filter: docname in docnames
```