(doc_faq)=

# FAQ

```{contents} Table Of Contents
:backlinks: top
:local: true
```

______________________________________________________________________

## General

### How can I subscribe to the PETSc mailing lists?

See the mailing list {ref}`documentation <doc_mail>`.

### Any useful books on numerical computing?

[Bueler, PETSc for Partial Differential Equations: Numerical Solutions in C and Python](https://my.siam.org/Store/Product/viewproduct/?ProductId=32850137)

[Oliveira and Stewart, Writing Scientific Software: A Guide to Good Style](https://www.cambridge.org/core/books/writing-scientific-software/23206704175AF868E43FE3FB399C2F53)

(doc_faq_general_parallel)=

### What kind of parallel computers or clusters are needed to use PETSc? Or why do I get little speedup?

:::{important}
PETSc can be used with any kind of parallel system that supports MPI, but for decent
performance one needs:

- A fast, **low-latency** interconnect; any ethernet (even 10 GigE) simply cannot provide
  the needed performance.
- High per-core **memory** performance. Each core needs its **own** memory bandwidth of
  at least 2 gigabytes/second. Most modern computers are not bottlenecked by how fast
  they can perform calculations; rather, they are usually restricted by how quickly they
  can get their data.
:::

To obtain good performance it is important that you know your machine: how many compute
nodes (generally, how many motherboards), how many memory sockets per node, how many
cores per memory socket, and how much memory bandwidth each has.

If you do not know this, and you can run MPI programs with `mpiexec` (that is, you do not
have a batch system), run the following from `$PETSC_DIR`:

```console
$ make streams [NPMAX=maximum_number_of_mpi_processes_you_plan_to_use]
```

This will provide a summary of the memory bandwidth with different numbers of MPI
processes and the potential speedups. See {any}`ch_streams` for a detailed discussion.

If you have a batch system:

```console
$ cd $PETSC_DIR/src/benchmarks/streams
$ make MPIVersion
submit MPIVersion to the batch system a number of times with 1, 2, 3, etc. MPI processes
collecting all of the output from the runs into the single file scaling.log. Copy
scaling.log into the src/benchmarks/streams directory.
$ ./process.py createfile ; process.py
```

65
66Even if you have enough memory bandwidth if the OS switches processes between cores
67performance can degrade. Smart process to core/socket binding (this just means locking a
68process to a particular core or memory socket) may help you. For example, consider using
69fewer processes than cores and binding processes to separate sockets so that each process
70uses a different memory bus:
71
72- [MPICH2 binding with the Hydra process manager](https://github.com/pmodels/mpich/blob/main/doc/wiki/how_to/Using_the_Hydra_Process_Manager.md#process-core-binding]
73
74  ```console
75  $ mpiexec.hydra -n 4 --binding cpu:sockets
76  ```
77
78- [Open MPI binding](https://www.open-mpi.org/faq/?category=tuning#using-paffinity)
79
80  ```console
81  $ mpiexec -n 4 --map-by socket --bind-to socket --report-bindings
82  ```
83
84- `taskset`, part of the [util-linux](https://github.com/karelzak/util-linux) package
85
86  Check `man taskset` for details. Make sure to set affinity for **your** program,
87  **not** for the `mpiexec` program.
88
- `numactl`

  In addition to task affinity, this tool also allows changing the default memory affinity
  policy. On Linux, the default policy is to attempt to find memory on the same memory bus
  that serves the core that a thread is running on when the memory is faulted
  (not when `malloc()` is called). If local memory is not available, it is found
  elsewhere, possibly leading to serious memory imbalances.

  The option `--localalloc` allocates memory on the local NUMA node, similar to the
  `numa_alloc_local()` function in the `libnuma` library. The option
  `--cpunodebind=nodes` binds the process to a given NUMA node (note that this can be
  larger or smaller than a CPU (socket); a NUMA node usually has multiple cores).

  The option `--physcpubind=cpus` binds the process to a given processor core (numbered
  according to `/proc/cpuinfo`, therefore including logical cores if Hyper-threading is
  enabled).

  With Open MPI, you can use knowledge of the NUMA hierarchy and core numbering on your
  machine to calculate the correct NUMA node or processor number given the environment
  variable `OMPI_COMM_WORLD_LOCAL_RANK`. In most cases, it is easier to have `mpiexec` or
  a resource manager set affinities.

### What kind of license is PETSc released under?

See the licensing {ref}`documentation <doc_license>`.

### Why is PETSc written in C, instead of Fortran or C++?

When this decision was made, in the early 1990s, C enabled us to build data structures
for storing sparse matrices, solver information,
etc. in ways that Fortran simply did not allow. ANSI C was a complete standard that all
modern C compilers supported. The language was identical on all machines. C++ was still
evolving and compilers on different machines were not identical. Using C function pointers
to provide data encapsulation and polymorphism allowed us to get many of the advantages of
C++ without using such a large and complicated language. It would have been natural and
reasonable to have coded PETSc in C++; we opted to use C instead.

### Does all the PETSc error checking and logging reduce PETSc's efficiency?

No.

(doc_faq_maintenance_strats)=

### How do such a small group of people manage to write and maintain such a large and marvelous package as PETSc?

1. **We work very efficiently**.

   - We use powerful editors and programming environments.
   - Our manual pages are generated automatically from formatted comments in the code,
     thus alleviating the need for creating and maintaining manual pages.
   - We employ continuous integration testing of the entire PETSc library on many different
     machine architectures. This process **significantly** protects (no bug-catching
     process is perfect) against inadvertently introducing bugs with new additions. Every
     new feature **must** pass our suite of thousands of tests as well as formal code
     review before it may be included.

2. **We are very careful in our design (and are constantly revising our design)**

   - PETSc as a package should be easy to use, write, and maintain. Our mantra is to write
     code like everyone is using it.

3. **We are willing to do the grunt work**

   - PETSc is regularly checked to make sure that all code conforms to our interface
     design. We will never keep a bad design decision simply because changing it will
     require a lot of editing; we do a lot of editing.

4. **We constantly seek out and experiment with new design ideas**

   - We retain the useful ones and discard the rest. All of these decisions are based not
     just on performance, but also on **practicality**.

5. **Function and variable names must adhere to strict guidelines**

   - Even the rules about capitalization are designed to make it easy to figure out the
     name of a particular object or routine. Our memories are terrible, so careful,
     consistent naming puts less stress on our limited human RAM.

6. **The PETSc directory tree is designed to make it easy to move throughout the
   entire package**

7. **We have a rich, robust, and fast bug reporting system**

   - <mailto:petsc-maint@mcs.anl.gov> is always checked, and we pride ourselves on responding
     quickly and accurately. Email is very lightweight, and our bug reporting system
     retains an archive of all reported problems and fixes, so it is easy to re-find
     fixes to previously discovered problems.

8. **We contain the complexity of PETSc by using powerful object-oriented programming
   techniques**

   - Data encapsulation serves to abstract complex data formats or movement to
     human-readable format. This is why your program cannot, for example, look directly
     at what is inside the object `Mat`.
   - Polymorphism makes changing program behavior as easy as possible, and further
     abstracts the *intent* of your program from what is *written* in code. You call
     `MatMult()` regardless of whether your matrix is dense, sparse, parallel or
     sequential; you don't call a different routine for each format.

9. **We try to provide the functionality requested by our users**

### For complex numbers will I get better performance with C++?

To use PETSc with complex numbers you may use the following `configure` options:
`--with-scalar-type=complex` and either `--with-clanguage=c++` or (the default)
`--with-clanguage=c`. In our experience they deliver very similar performance
(speed), but if you are concerned, try both and see if one is faster.

### How come when I run the same program on the same number of processes I get a "different" answer?

Inner products and norms in PETSc are computed using the `MPI_Allreduce()` command. For
different runs the order in which values arrive at a given process (via MPI) can
differ, thus the order in which some floating point arithmetic operations are
performed will be different. Since floating point arithmetic is not
associative, the computed quantity may be slightly different.

Over a run, the many slight differences in the inner products and norms will affect all the
computed results. It is important to realize that none of the computed answers are any
less right or wrong (in fact the sequential computation is no more right than the parallel
ones). All answers are equal, but some are more equal than others.

The discussion above assumes that the exact same algorithm is being used on the different
numbers of processes. When the algorithm differs with the number of processes
(almost all preconditioner algorithms except Jacobi do) one expects to see (and does)
a greater difference in results for
different numbers of processes. In some cases (for example the block Jacobi preconditioner) it
may be that the algorithm works for some numbers of processes and does not work for others.

### How come when I run the same linear solver on a different number of processes it takes a different number of iterations?

The convergence of many of the preconditioners in PETSc, including the default parallel
preconditioner block Jacobi, depends on the number of processes. The more processes, the
(slightly) slower the convergence. This is the nature of iterative solvers: more
parallelism means more "older" information is used in the solution process, hence
slower convergence.

(doc_faq_gpuhowto)=

### Can PETSc use GPUs to speed up computations?

:::{seealso}
See the GPU development {ref}`roadmap <doc_gpu_roadmap>` for the latest information
regarding the state of PETSc GPU integration.

See the GPU install {ref}`documentation <doc_config_accel>` for up-to-date information on
installing PETSc to use GPUs.
:::

Quick summary of usage with CUDA:

- The `VecType` `VECSEQCUDA`, `VECMPICUDA`, or `VECCUDA` may be used with
  `VecSetType()` or `-vec_type seqcuda`, `mpicuda`, or `cuda` when
  `VecSetFromOptions()` is used.
- The `MatType` `MATSEQAIJCUSPARSE`, `MATMPIAIJCUSPARSE`, or `MATAIJCUSPARSE`
  may be used with `MatSetType()` or `-mat_type seqaijcusparse`, `mpiaijcusparse`, or
  `aijcusparse` when `MatSetFromOptions()` is used.
- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type
  cuda` and `-dm_mat_type aijcusparse`.

Quick summary of usage with OpenCL (provided by the ViennaCL library):

- The `VecType` `VECSEQVIENNACL`, `VECMPIVIENNACL`, or `VECVIENNACL` may be used
  with `VecSetType()` or `-vec_type seqviennacl`, `mpiviennacl`, or `viennacl`
  when `VecSetFromOptions()` is used.
- The `MatType` `MATSEQAIJVIENNACL`, `MATMPIAIJVIENNACL`, or `MATAIJVIENNACL`
  may be used with `MatSetType()` or `-mat_type seqaijviennacl`, `mpiaijviennacl`, or
  `aijviennacl` when `MatSetFromOptions()` is used.
- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type
  viennacl` and `-dm_mat_type aijviennacl`.

General hints:

- It is useful to develop your code with the default vectors and then run production runs
  with the command line options to use the GPU, since debugging on GPUs is difficult.
- All of the Krylov methods except `KSPIBCGS` run on the GPU.
- Parts of most preconditioners run directly on the GPU. After setup, `PCGAMG` runs
  fully on GPUs, without any memory copies between the CPU and GPU.

Some GPU systems (for example many laptops) only run with single precision; thus, PETSc
must be built with the `configure` option `--with-precision=single`.

(doc_faq_extendedprecision)=

### Can I run PETSc with extended precision?

Yes, with gcc and gfortran. `configure` PETSc using the
options `--with-precision=__float128` and `--download-f2cblaslapack`.

:::{admonition} Warning
:class: yellow

External packages are not guaranteed to work in this mode!
:::

### Why doesn't PETSc use Qd to implement support for extended precision?

We tried really hard but could not. The problem is that the QD C++ classes, though they
try to mimic the built-in `double` data type, are not native types and cannot
"just be used" in a general piece of numerical source code. Rather, the code has to be
rewritten to live within the limitations of the QD classes. However, PETSc can be built to
use quad precision, as detailed {ref}`here <doc_faq_extendedprecision>`.

### How do I cite PETSc?

Use {any}`these citations <doc_index_citing_petsc>`.

______________________________________________________________________

## Installation

### How do I begin using PETSc if the software has already been completely built and installed by someone else?

Assuming that the PETSc libraries have been successfully built for a particular
architecture and level of optimization, a new user must merely:

1. Set `PETSC_DIR` to the full path of the PETSc home
   directory. This will be the location of the `configure` script, and is usually called
   "petsc" or some variation of that (for example, /home/username/petsc).
2. Set `PETSC_ARCH`, which indicates the configuration on which PETSc will be
   used. Note that `$PETSC_ARCH` is simply a name the installer used when installing
   the libraries. There will exist a directory within `$PETSC_DIR` that is named after
   its corresponding `$PETSC_ARCH`. There may be several on a single system, for
   example "linux-c-debug" for the debug version compiled by a C compiler or
   "linux-c-opt" for the optimized version.

:::{admonition} Still Stuck?
See the {ref}`quick-start tutorial <tut_install>` for a step-by-step guide on
installing PETSc, in case you have missed a step.

See the users manual section on {ref}`getting started <sec_getting_started>`.
:::

### The PETSc distribution is SO Large. How can I reduce my disk space usage?

1. The PETSc users manual is provided in PDF format at `$PETSC_DIR/manual.pdf`. You
   can delete this.
2. The PETSc test suite contains sample output for many of the examples. These are
   contained in the PETSc directories `$PETSC_DIR/src/*/tutorials/output` and
   `$PETSC_DIR/src/*/tests/output`. Once you have run the test examples, you may remove
   all of these directories to save some disk space. You can locate the largest with
   e.g. `find . -name output -type d | xargs du -sh | sort -hr` on a Unix-based system.
3. The debugging versions of the libraries are larger than the optimized versions. In a
   pinch you can work with the optimized version, although we bid you good luck in
   finding bugs, as it is much easier with the debug version.

### I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI?

No, run `configure` with the option `--with-mpi=0`.

### Can I install PETSc to not use X windows (either under Unix or Microsoft Windows with GCC)?

Yes. Run `configure` with the additional flag `--with-x=0`.

### Why do you use MPI?

MPI is the message-passing standard. Because it is a standard, it will not frequently change over
time; thus, we do not have to change PETSc every time the provider of the message-passing
system decides to make an interface change. MPI was carefully designed by experts from
industry, academia, and government labs to provide the highest quality performance and
capability.

For example, the careful design of communicators in MPI allows the easy nesting of
different libraries; no other message-passing system provides this support. All of the
major parallel computer vendors were involved in the design of MPI and have committed to
providing quality implementations.

In addition, since MPI is a standard, several different groups have already provided
complete free implementations. Thus, one does not have to rely on the technical skills of
one particular group to provide the message-passing libraries. Today, MPI is the only
practical, portable approach to writing efficient parallel numerical software.

(invalid_mpi_compilers)=

### What do I do if my MPI compiler wrappers are invalid?

Most MPI implementations provide compiler wrappers (such as `mpicc`) which give the
include and link options necessary to use that version of MPI with the underlying
compilers. Configuration will fail if these wrappers are either absent or broken in the MPI pointed to by
`--with-mpi-dir`. You can rerun `configure` with the additional option
`--with-mpi-compilers=0`, which will try to auto-detect working compilers; however,
these compilers may be incompatible with the particular MPI build. If this fix does not
work, run with `--with-cc=[your_c_compiler]` where you know `your_c_compiler` works
with this particular MPI, and likewise for C++ (`--with-cxx=[your_cxx_compiler]`) and Fortran
(`--with-fc=[your_fortran_compiler]`).

(bit_indices)=

### When should/can I use the `configure` option `--with-64-bit-indices`?

By default the type that PETSc uses to index into arrays and keep sizes of arrays is a
`PetscInt`, defined to be a 32-bit `int`. If your problem:

- Involves more than 2^31 - 1 unknowns (around 2 billion), or
- Might produce a matrix with more than 2^31 - 1 nonzeros on a single process,

then you need to use this option. Otherwise you will get strange crashes.

This option can be used when you are using either 32-bit or 64-bit pointers. You do not
need to use this option if you are using 64-bit pointers unless one of the two conditions
above holds.

### What if I get an internal compiler error?

You can rebuild the offending file individually with a lower optimization level. **Then
make sure to complain to the compiler vendor and file a bug report**. For example, if the
compiler chokes on `src/mat/impls/baij/seq/baijsolvtrannat.c` you can run the following
from `$PETSC_DIR`:

```console
$ make -f gmakefile PCC_FLAGS="-O1" $PETSC_ARCH/obj/src/mat/impls/baij/seq/baijsolvtrannat.o
$ make all
```

### How do I enable Python bindings (petsc4py) with PETSc?

1. Install [Cython](https://cython.org/).
2. `configure` PETSc with the `--with-petsc4py=1` option.
3. Set `PYTHONPATH=$PETSC_DIR/$PETSC_ARCH/lib`.

(macos_gfortran)=

### What Fortran compiler do you recommend on macOS?

We recommend using [homebrew](https://brew.sh/) to install [gfortran](https://gcc.gnu.org/wiki/GFortran); see {any}`doc_macos_install`.

### How can I find the URL locations of the packages you install using `--download-PACKAGE`?

```console
$ grep "self.download " $PETSC_DIR/config/BuildSystem/config/packages/*.py
```

### How to fix the problem: PETSc was configured with one MPICH (or Open MPI) `mpi.h` version but now appears to be compiling using a different MPICH (or Open MPI) `mpi.h` version

This generally happens when:

- You previously ran `configure` with the option `--download-mpich` (or `--download-openmpi`)
  but later ran `configure` to use a version of MPI already installed on the
  machine. Solution:

  ```console
  $ rm -rf $PETSC_DIR/$PETSC_ARCH
  $ ./configure --your-args
  ```

(mpi_network_misconfigure)=

### What does it mean when `make check` hangs or errors on `PetscOptionsInsertFile()`?

For example:

```none
Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
See https://petsc.org/release/faq/
[0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #3 PetscInitialize() line 828 in /Users/barrysmith/Src/PETSc/src/sys/objects/pinit.c
```

or

```none
$ make check
Running check examples to verify correct installation
Using PETSC_DIR=/Users/barrysmith/Src/petsc and PETSC_ARCH=arch-fix-mpiexec-hang-2-ranks
C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
PROGRAM SEEMS TO BE HANGING HERE
```

This usually occurs when network settings are misconfigured (perhaps due to a VPN), resulting in a failure or hang in the system call `gethostbyname()`.

- Verify you are using the correct `mpiexec` for the MPI you have linked PETSc with.

- If you have a VPN enabled on your machine, try turning it off and then running `make check` to
  verify that it is not the VPN playing poorly with MPI.

- If ``ping `hostname` `` (`/sbin/ping` on macOS) fails or hangs, do:

  ```none
  echo 127.0.0.1 `hostname` | sudo tee -a /etc/hosts
  ```

  and try `make check` again.

- Try completely disconnecting your machine from the network and see if `make check` then works.

- Try the PETSc `configure` option `--download-mpich-device=ch3:nemesis` with `--download-mpich`.

______________________________________________________________________

## Usage

### How can I redirect PETSc's `stdout` and `stderr` when programming with a GUI interface in Windows Developer Studio or to C++ streams?

To overload just the error messages, write your own `MyPrintError()` function that does
whatever you want (including popping up windows, etc.) and use it as shown below.

```c
extern "C" {
  int PASCAL WinMain(HINSTANCE,HINSTANCE,LPSTR,int);
};

#include <petscsys.h>
#include <mpi.h>

const char help[] = "Set up from main";

int MyPrintError(const char error[], ...)
{
  printf("%s", error);
  return 0;
}

int main(int ac, char *av[])
{
  char      buf[256];
  HINSTANCE inst;

  inst = (HINSTANCE)GetModuleHandle(NULL);
  PetscErrorPrintf = MyPrintError;

  buf[0] = 0;
  for (int i = 1; i < ac; ++i) {
    strcat(buf, av[i]);
    strcat(buf, " ");
  }

  PetscCall(PetscInitialize(&ac, &av, NULL, help));

  return WinMain(inst, NULL, buf, SW_SHOWNORMAL);
}
```

Place this file in the project and compile with these preprocessor definitions:

```
WIN32
_DEBUG
_CONSOLE
_MBCS
USE_PETSC_LOG
USE_PETSC_BOPT_g
USE_PETSC_STACK
_AFXDLL
```

And these link options:

```
/nologo
/subsystem:console
/incremental:yes
/debug
/machine:I386
/nodefaultlib:"libcmtd.lib"
/nodefaultlib:"libcd.lib"
/nodefaultlib:"mvcrt.lib"
/pdbtype:sept
```

:::{note}
The above is compiled and linked as if it were a console program. The linker will search
for a main, and from it the `WinMain` will start. This works with MFC templates and
derived classes too.

When writing a Windows console application you do not need to do anything; the `stdout`
and `stderr` are automatically output to the console window.
:::

To change where all PETSc `stdout` and `stderr` go (you can also reassign
`PetscVFPrintf()` to handle `stdout` and `stderr` any way you like), write the
following function:

```
PetscErrorCode mypetscvfprintf(FILE *fd, const char format[], va_list Argp)
{
  PetscFunctionBegin;
  if (fd != stdout && fd != stderr) { /* handle regular files */
    PetscCall(PetscVFPrintfDefault(fd, format, Argp));
  } else {
    char buff[1024]; /* Make sure to assign a large enough buffer */
    int  length;

    PetscCall(PetscVSNPrintf(buff, 1024, format, &length, Argp));

    /* now send buff to whatever stream or whatever you want */
  }
  PetscFunctionReturn(PETSC_SUCCESS);
}
```

Then assign `PetscVFPrintf = mypetscvfprintf;` before `PetscInitialize()` in your main
program.

### I want to use Hypre boomerAMG without GMRES but when I run `-pc_type hypre -pc_hypre_type boomeramg -ksp_type preonly` I don't get a very accurate answer!

You should run with `-ksp_type richardson` to have PETSc run several V or W
cycles. `-ksp_type preonly` causes boomerAMG to use only one V/W cycle. You can control
how many cycles are used in a single application of the boomerAMG preconditioner with
`-pc_hypre_boomeramg_max_iter <it>` (the default is 1). You can also control the
tolerance boomerAMG uses to decide whether to stop before `max_iter` with
`-pc_hypre_boomeramg_tol <tol>` (the default is 1.e-7). Run with `-ksp_view` to see
all the hypre options used and `-help | grep boomeramg` to see all the command line
options.

### How do I use PETSc for Domain Decomposition?

PETSc includes Additive Schwarz methods in the suite of preconditioners under the umbrella
of `PCASM`. These may be activated with the runtime option `-pc_type asm`. Various
other options may be set, including the degree of overlap with `-pc_asm_overlap <number>`
and the type of restriction/extension with `-pc_asm_type [basic,restrict,interpolate,none]`,
among several others. You may see the available ASM options by using `-pc_type asm
-help`. See the procedural interfaces in the manual pages, for example `PCASMSetType()`,
and check the index of the users manual for `PCASMCreateSubdomains()`.

PETSc also contains a domain decomposition inspired wirebasket or face based two level
method where the coarse mesh to fine mesh interpolation is defined by solving specific
local subdomain problems. It currently only works for 3D scalar problems on structured
grids created with PETSc `DMDA`. See the manual page for `PCEXOTIC` and
`src/ksp/ksp/tutorials/ex45.c` for an example.

PETSc also contains a balancing Neumann-Neumann type preconditioner; see the manual page
for `PCBDDC`. This requires that matrices be constructed with `MatCreateIS()` via the finite
element method. See `src/ksp/ksp/tests/ex59.c` for an example.

### You have `MATAIJ` and `MATBAIJ` matrix formats, and `MATSBAIJ` for symmetric storage, how come no `MATSAIJ`?

Just for historical reasons; the `MATSBAIJ` format with blocksize one is just as efficient as
a `MATSAIJ` would be.

### Can I Create BAIJ matrices with different size blocks for different block rows?

No. The `MATBAIJ` format only supports a single fixed block size for the entire
matrix. But the `MATAIJ` format automatically searches for matching rows and thus still
takes advantage of the natural blocks in your matrix to obtain good performance.

:::{note}
If you use `MATAIJ` you cannot use `MatSetValuesBlocked()`.
:::

### How do I access the values of a remote parallel PETSc Vec?

1. On each process create a local `Vec` large enough to hold all the values it wishes to
   access.
2. Create a `VecScatter` that scatters from the parallel `Vec` into the local `Vec`.
3. Use `VecGetArray()` to access the values in the local `Vec`.

For example, assume we have distributed a vector `vecGlobal` of size $N$ over
$R$ ranks so that each rank holds $N/R = m$ values (assume
$N$ is evenly divisible by $R$). We want each rank $r$ to gather the
first $n$ (assume $n \leq m$) values from its immediately superior neighbor
$r+1$ (the final rank retrieves from rank 0).

```
Vec            vecLocal;
IS             isLocal, isGlobal;
VecScatter     ctx;
PetscScalar    *arr;
PetscInt       N, firstGlobalIndex;
MPI_Comm       comm;
PetscMPIInt    r, R;

/* Create sequential local vector, big enough to hold local portion */
PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, &vecLocal));

/* Create IS to describe where we want to scatter to */
PetscCall(ISCreateStride(PETSC_COMM_SELF, n, 0, 1, &isLocal));

/* Compute the global indices */
PetscCall(VecGetSize(vecGlobal, &N));
PetscCall(PetscObjectGetComm((PetscObject) vecGlobal, &comm));
PetscCallMPI(MPI_Comm_rank(comm, &r));
PetscCallMPI(MPI_Comm_size(comm, &R));
firstGlobalIndex = r == R-1 ? 0 : (N/R)*(r+1);

/* Create IS that describes where we want to scatter from */
PetscCall(ISCreateStride(comm, n, firstGlobalIndex, 1, &isGlobal));

/* Create the VecScatter context */
PetscCall(VecScatterCreate(vecGlobal, isGlobal, vecLocal, isLocal, &ctx));

/* Gather the values */
PetscCall(VecScatterBegin(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(VecScatterEnd(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));

/* Retrieve and do work */
PetscCall(VecGetArray(vecLocal, &arr));
/* Work */
PetscCall(VecRestoreArray(vecLocal, &arr));

/* Don't forget to clean up */
PetscCall(ISDestroy(&isLocal));
PetscCall(ISDestroy(&isGlobal));
PetscCall(VecScatterDestroy(&ctx));
PetscCall(VecDestroy(&vecLocal));
```

(doc_faq_usage_alltoone)=

### How do I collect to a single processor all the values from a parallel PETSc Vec?

1. Create the `VecScatter` context that will do the communication:

   ```
   Vec        in_par,out_seq;
   VecScatter ctx;

   PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
   ```

2. Initiate the communication (this may be repeated if you wish):

   ```
   PetscCall(VecScatterBegin(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
   PetscCall(VecScatterEnd(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
   /* May destroy context now if no additional scatters are needed, otherwise reuse it */
   PetscCall(VecScatterDestroy(&ctx));
   ```

Note that this simply concatenates in the parallel ordering of the vector (determined by the
`MPI_Comm_rank` of the owning process). If you are using a `Vec` from
`DMCreateGlobalVector()` you likely want to first call `DMDAGlobalToNaturalBegin()`
followed by `DMDAGlobalToNaturalEnd()` to scatter the original `Vec` into the natural
ordering in a new global `Vec` before calling `VecScatterBegin()`/`VecScatterEnd()`
to scatter the natural `Vec` onto all processes.

### How do I collect all the values from a parallel PETSc Vec on the 0th rank?

See the FAQ entry on collecting to {ref}`an arbitrary processor <doc_faq_usage_alltoone>`, but
replace

```
PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
```

with

```
PetscCall(VecScatterCreateToZero(in_par, &ctx, &out_seq));
```

:::{note}
The same ordering considerations as discussed in the aforementioned entry also apply
here.
:::

### How can I read in or write out a sparse matrix in Matrix Market, Harwell-Boeing, Slapc or other ASCII format?

If you can read or write your matrix using Python or MATLAB/Octave, `PetscBinaryIO`
modules are provided at `$PETSC_DIR/lib/petsc/bin` for each language that can assist
with reading and writing. If you just want to convert `MatrixMarket`, you can use:

```console
$ python -m $PETSC_DIR/lib/petsc/bin/PetscBinaryIO convert matrix.mtx
```

to produce `matrix.petsc`.

You can also call the script directly or import it from your Python code. There is also a
`PETScBinaryIO.jl` Julia package.

For other formats, either adapt one of the above libraries or see the examples in
`$PETSC_DIR/src/mat/tests`, specifically `ex72.c` or `ex78.c`. You will likely need
to modify the code slightly to match your required ASCII format.

:::{note}
Never read or write an ASCII matrix file in parallel.

Instead, read it in sequentially with a standalone code based on `ex72.c` or `ex78.c`,
save the matrix with the binary viewer `PetscViewerBinaryOpen()`, and load the
matrix in parallel in your "real" PETSc program with `MatLoad()`.

For writing, save with the binary viewer and then load with the sequential code to store
it as ASCII.
:::

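If you need a custom converter, the classic PETSc binary `Mat` layout is simple enough
to emit directly. The sketch below (plain Python, standard library only) writes a small
AIJ matrix from COO triplets; the layout shown (big-endian, 32-bit indices, the
`1211216` Mat classid, row lengths then column indices then values) is our reading of
the format, so verify any output against `PetscBinaryIO` or `MatLoad()` before relying
on it.

```python
import struct

MAT_FILE_CLASSID = 1211216  # magic number identifying a Mat in PETSc binary files

def write_petsc_mat(path, m, n, coo):
    """Write COO triplets (row, col, value) as a PETSc binary AIJ matrix.

    Assumed layout, all big-endian: int32 classid, m, n, nnz; then one int32
    nonzero count per row; then nnz int32 column indices; then nnz float64
    values, sorted by row (and by column within each row).
    """
    coo = sorted(coo)                 # row-major order, as the format expects
    rowlens = [0] * m
    cols, vals = [], []
    for r, c, v in coo:
        rowlens[r] += 1
        cols.append(c)
        vals.append(v)
    nnz = len(coo)
    with open(path, "wb") as fh:
        fh.write(struct.pack(">4i", MAT_FILE_CLASSID, m, n, nnz))
        fh.write(struct.pack(">%di" % m, *rowlens))
        fh.write(struct.pack(">%di" % nnz, *cols))
        fh.write(struct.pack(">%dd" % nnz, *vals))

# a 2x2 diagonal matrix
write_petsc_mat("matrix.petsc", 2, 2, [(0, 0, 1.0), (1, 1, 2.0)])
```
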
### Does TSSetFromOptions(), SNESSetFromOptions(), or KSPSetFromOptions() reset all the parameters I previously set, or why do they not seem to work?

If `XXSetFromOptions()` is used (with `-xxx_type aaaa`) to change the type of the
object then all parameters associated with the previous type are removed. Otherwise it
does not reset parameters.

`TS/SNES/KSPSetXXX()` commands that set properties for a particular type of object (such
as `KSPGMRESSetRestart()`) ONLY work if the object is ALREADY of that type. For example,
with

```
KSP ksp;

PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
PetscCall(KSPGMRESSetRestart(ksp, 10));
```

the restart will be ignored since the type has not yet been set to `KSPGMRES`. To have
those values take effect you should do one of the following:

- Allow setting the type from the command line; if it is not on the command line then the
  default type is automatically set.

```
/* Create generic object */
XXXCreate(..,&obj);
/* Must set all settings here, or use the defaults */
XXXSetFromOptions(obj);
```

- Hardwire the type in the code, but allow the user to override it via a subsequent
  `XXXSetFromOptions()` call. This essentially allows the user to customize the
  "default" type of the object.

```
/* Create generic object */
XXXCreate(..,&obj);
/* Set the type directly */
XXXSetYYYYY(obj,...);
/* Can always change to a different type */
XXXSetFromOptions(obj);
```

### How do I compile and link my own PETSc application codes and can I use my own `makefile` or rules for compiling code, rather than PETSc's?

See the {ref}`section <sec_writing_application_codes>` of the users manual on writing
application codes with PETSc.

### Can I use CMake to build my own project that depends on PETSc?

See the {ref}`section <sec_writing_application_codes>` of the users manual on writing
application codes with PETSc.

### How can I put carriage returns in `PetscPrintf()` statements from Fortran?

You can use the same notation as in C, just put a `\n` in the string. Note that no other C
format instruction is supported.

Or you can use the Fortran concatenation `//` and `char(10)`; for example `'some
string'//char(10)//'another string'` on the next line.

### How can I implement callbacks using C++ class methods?

Declare the class method static. Static methods do not have a `this` pointer, but the
`void*` context parameter will usually be cast to a pointer to the class where it can
serve the same function.

:::{admonition} Remember
All PETSc callbacks return `PetscErrorCode`.
:::

### Everyone knows that when you code Newton's Method you should compute the function and its Jacobian at the same time. How can one do this in PETSc?

The update in Newton's method is computed as

$$
u^{n+1} = u^n - \lambda \left[J(u^n)\right]^{-1} F(u^n)
$$

The reason PETSc doesn't default to computing both the function and Jacobian at the same
time is:

- In order to do the line search, $F \left(u^n - \lambda * \text{step} \right)$ may
  need to be computed for several $\lambda$. The Jacobian is not needed for each of
  those, and one does not know in advance which will be the final $\lambda$ until
  after the function value is computed, so many extra Jacobians may be computed.
- In the final step, if $|| F(u^p)||$ satisfies the convergence criteria then a
  Jacobian need not be computed.

You are free to have your `FormFunction()` compute as much of the Jacobian at that point
as you like, keep the information in the user context (the final argument to
`FormFunction()` and `FormJacobian()`), and then retrieve the information in your
`FormJacobian()` function.

### Computing the Jacobian or preconditioner is time consuming. Is there any way to compute it less often?

PETSc has a variety of ways of lagging the computation of the Jacobian or the
preconditioner. They are documented in the manual page for `SNESComputeJacobian()`
and in the {ref}`users manual <ch_snes>`:

- `-snes_lag_jacobian` (`SNESSetLagJacobian()`) How often the Jacobian is rebuilt (use
  -1 to never rebuild, use -2 to rebuild the next time it is requested and then never
  again).
- `-snes_lag_jacobian_persists` (`SNESSetLagJacobianPersists()`) Forces lagging of the
  Jacobian through multiple `SNES` solves, the same as passing -2 to
  `-snes_lag_jacobian`. By default, each new `SNES` solve
  normally triggers a recomputation of the Jacobian.
- `-snes_lag_preconditioner` (`SNESSetLagPreconditioner()`) How often the preconditioner
  is rebuilt. Note: if you are lagging the Jacobian the system will
  know that the matrix has not changed and will not recompute the
  (same) preconditioner.
- `-snes_lag_preconditioner_persists` (`SNESSetLagPreconditionerPersists()`) The
  preconditioner lags through multiple `SNES` solves.

:::{note}
These are often (but need not be) used in combination with
`-snes_mf_operator`, which applies the fresh Jacobian matrix-free for every
matrix-vector product. Otherwise the out-of-date matrix-vector product, computed with
the lagged Jacobian, will be used.
:::

By using `KSPMonitorSet()` and/or `SNESMonitorSet()` one can provide code that monitors the
convergence rate and automatically triggers an update of the Jacobian or preconditioner
based on decreasing convergence of the iterative method. For example, if the number of `SNES`
iterations doubles one might trigger a new computation of the Jacobian. Experimentation is
the only general purpose way to determine which approach is best for your problem.

:::{important}
It is also vital to experiment on your true problem at the scale you will be solving
the problem, since the performance benefits depend on the exact problem and the problem
size!
:::

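The trade-off behind lagging can be seen in a few lines of plain Python (no PETSc): a
Newton iteration that rebuilds its "Jacobian" (here just a scalar derivative) only every
`lag` steps needs more iterations than fresh Newton, but each iteration skips the
Jacobian cost. This is only an illustration of the idea, not how `SNES` implements it.

```python
def newton_lagged(f, fp, x, lag, tol=1e-12, maxit=100):
    """Newton iteration that recomputes the derivative only every `lag`
    iterations, mimicking what -snes_lag_jacobian does for a full Jacobian."""
    J = fp(x)
    for it in range(maxit):
        if abs(f(x)) < tol:
            return x, it
        if it % lag == 0:      # rebuild the "Jacobian" only every `lag` steps
            J = fp(x)
        x = x - f(x) / J       # otherwise reuse the stale one
    return x, maxit

f = lambda x: x * x - 2.0      # solve x^2 = 2
fp = lambda x: 2.0 * x

x1, it1 = newton_lagged(f, fp, 1.0, lag=1)   # fresh Jacobian every step
x5, it5 = newton_lagged(f, fp, 1.0, lag=5)   # lagged: more, cheaper, iterations
```

Both runs converge to $\sqrt{2}$; the lagged variant takes more iterations, which is
exactly the trade you are making when the Jacobian dominates the cost.
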
### How can I use Newton's Method Jacobian free? Can I difference a different function than provided with `SNESSetFunction()`?

The simplest way is with the option `-snes_mf`; this will use finite differencing of the
function provided with `SNESSetFunction()` to approximate the action of the Jacobian.

:::{important}
Since no matrix representation of the Jacobian is provided, the `-pc_type` used with
this option must be `-pc_type none`. You may provide a custom preconditioner with
`SNESGetKSP()`, `KSPGetPC()`, and `PCSetType()` and use `PCSHELL`.
:::

The option `-snes_mf_operator` will apply the Jacobian matrix-free (in the
Krylov solvers) but will use whatever matrix you provided with `SNESSetJacobian()`
(assuming you set one) to compute the preconditioner.

To write the code (rather than use the options above) use `MatCreateSNESMF()` and pass
the resulting matrix object to `SNESSetJacobian()`.

For purely matrix-free (like `-snes_mf`) pass the matrix object for both matrix
arguments and pass the function `MatMFFDComputeJacobian()`.

To provide your own approximate Jacobian matrix to compute the preconditioner (like
`-snes_mf_operator`), pass this other matrix as the second matrix argument to
`SNESSetJacobian()`. Make sure your provided `computejacobian()` function calls
`MatAssemblyBegin()` and `MatAssemblyEnd()` separately on **BOTH** matrix arguments
to this function. See `src/snes/tests/ex7.c`.

To difference a different function than that passed with `SNESSetFunction()` to compute the
matrix-free Jacobian multiply, call `MatMFFDSetFunction()` to set that other function. See
`src/snes/tests/ex7.c`.

(doc_faq_usage_condnum)=

### How can I determine the condition number of a matrix?

For small matrices, the condition number can be reliably computed using

```text
-pc_type svd -pc_svd_monitor
```

For larger matrices, you can run with

```text
-pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000
```

to get approximations to the condition number of the operator. This will generally be
accurate for the largest singular values, but may overestimate the smallest singular value
unless the method has converged. Make sure to avoid restarts. To estimate the condition
number of the preconditioned operator, use `-pc_type somepc` in the last command.

You can use [SLEPc](https://slepc.upv.es) for highly scalable, efficient, and quality eigenvalue computations.

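For intuition, the number reported by `-pc_type svd -pc_svd_monitor` is the ratio of the
extreme singular values, and the singular values are the square roots of the eigenvalues
of $A^T A$. For a tiny 2x2 example that ratio can be computed by hand (plain Python
sketch, quadratic formula in place of an eigensolver):

```python
import math

A = [[4.0, 1.0],
     [1.0, 3.0]]

# Entries of B = A^T A
b11 = A[0][0]**2 + A[1][0]**2
b12 = A[0][0]*A[0][1] + A[1][0]*A[1][1]
b22 = A[0][1]**2 + A[1][1]**2

# Eigenvalues of the 2x2 symmetric B via the quadratic formula
tr, det = b11 + b22, b11*b22 - b12*b12
lam_max = (tr + math.sqrt(tr*tr - 4*det)) / 2
lam_min = (tr - math.sqrt(tr*tr - 4*det)) / 2

cond = math.sqrt(lam_max / lam_min)   # 2-norm condition number of A
```

Since this `A` is symmetric positive definite, `cond` equals the ratio of its
eigenvalues, which is a handy cross-check.
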
### How can I compute the inverse of a matrix in PETSc?

:::{admonition} Are you sure?
:class: yellow

It is very expensive to compute the inverse of a matrix and very rarely needed in
practice. We highly recommend avoiding algorithms that need it.
:::

The inverse of a matrix (dense or sparse) is essentially always dense, so begin by
creating a dense matrix B filled with the identity matrix (ones along the diagonal);
also create a dense matrix X of the same size that will hold the solution. Then factor the
matrix you wish to invert with `MatLUFactor()` or `MatCholeskyFactor()`, calling the
result A. Then call `MatMatSolve(A,B,X)` to compute the inverse into X. See also the section
on {any}`the Schur complement <how_can_i_compute_the_schur_complement>`.

(how_can_i_compute_the_schur_complement)=

### How can I compute the Schur complement in PETSc?

:::{admonition} Are you sure?
:class: yellow

It is very expensive to compute the Schur complement of a matrix and very rarely needed
in practice. We highly recommend avoiding algorithms that need it.
:::

The Schur complement of the matrix $M \in \mathbb{R}^{\left(p+q \right) \times
\left(p + q \right)}$

$$
M = \begin{bmatrix}
A & B \\
C & D
\end{bmatrix}
$$

where

$$
A \in \mathbb{R}^{p \times p}, \quad B \in \mathbb{R}^{p \times q}, \quad C \in \mathbb{R}^{q \times p}, \quad D \in \mathbb{R}^{q \times q}
$$

is given by

$$
S_D := A - BD^{-1}C \\
S_A := D - CA^{-1}B
$$

Like the inverse, the Schur complement of a matrix (dense or sparse) is essentially always
dense, so assuming you wish to calculate $S_A = D - C \underbrace{
\overbrace{(A^{-1})}^{U} B}_{V}$ begin by:

1. Forming a dense matrix $B$.
2. Also create another dense matrix $V$ of the same size.
3. Then either factor the matrix $A$ directly with `MatLUFactor()` or
   `MatCholeskyFactor()`, or use `MatGetFactor()` followed by
   `MatLUFactorSymbolic()` followed by `MatLUFactorNumeric()` if you wish to use an
   external solver package like SuperLU_Dist. Call the result $U$.
4. Then call `MatMatSolve(U,B,V)`.
5. Then call `MatMatMult(C,V,MAT_INITIAL_MATRIX,1.0,&S)`.
6. Now call `MatAXPY(S,-1.0,D,SUBSET_NONZERO_PATTERN)`.
7. Followed by `MatScale(S,-1.0)`.

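The algebra behind those steps can be mirrored on a tiny block matrix in plain Python
(with an explicit 2x2 inverse standing in for `MatLUFactor()`/`MatMatSolve()`); the
block sizes $p = 2$, $q = 1$ are just for illustration:

```python
# Blocks of M = [A, B; C, D] with p = 2, q = 1
A = [[4.0, 1.0], [1.0, 3.0]]
B = [[1.0], [2.0]]          # p x q
C = [[2.0, 1.0]]            # q x p
D = [[5.0]]                 # q x q

# Explicit inverse of the 2x2 A (the role of the factorization U)
detA = A[0][0]*A[1][1] - A[0][1]*A[1][0]
Ainv = [[ A[1][1]/detA, -A[0][1]/detA],
        [-A[1][0]/detA,  A[0][0]/detA]]

# V = A^{-1} B          (step 4, MatMatSolve)
V = [[Ainv[i][0]*B[0][0] + Ainv[i][1]*B[1][0]] for i in range(2)]
# S = C V               (step 5, MatMatMult)
S = C[0][0]*V[0][0] + C[0][1]*V[1][0]
# S_A = D - C A^{-1} B  (steps 6-7, MatAXPY then MatScale)
S_A = D[0][0] - S
```
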
For computing Schur complements like this it does not make sense to use the `KSP`
iterative solvers, since for the many moderate-size solves required a direct
factorization is much faster than an iterative solver. As you can see, this requires a
great deal of work space and computation, so it is best avoided.

However, it is not necessary to assemble the Schur complement $S$ in order to solve
systems with it. Use `MatCreateSchurComplement(A,A_pre,B,C,D,&S)` to create a
matrix that applies the action of $S$ (using `A_pre` to solve with `A`), but
does not assemble.

Alternatively, if you already have a block matrix `M = [A, B; C, D]` (in some
ordering), then you can create index sets (`IS`) `isa` and `isb` to address each
block, then use `MatGetSchurComplement()` to create the Schur complement and/or an
approximation suitable for preconditioning.

Since $S$ is generally dense, standard preconditioning methods cannot typically be
applied directly to Schur complements. There are many approaches to preconditioning Schur
complements including using the `SIMPLE` approximation

$$
D - C \text{diag}(A)^{-1} B
$$

to create a sparse matrix that approximates the Schur complement (this is returned by
default for the optional "preconditioning" matrix in `MatGetSchurComplement()`).

An alternative is to interpret the matrices as differential operators and apply
approximate commutator arguments to find a spectrally equivalent operation that can be
applied efficiently (see the "PCD" preconditioners {cite}`elman_silvester_wathen_2014`). A
variant of this is the least squares commutator, which is closely related to the
Moore-Penrose pseudoinverse, and is available in `PCLSC` which operates on matrices of
type `MATSCHURCOMPLEMENT`.

### Do you have examples of doing unstructured grid Finite Element Method (FEM) with PETSc?

There are at least three ways to write finite element codes using PETSc:

1. Use `DMPLEX`, which is a high level approach to manage your mesh and
   discretization. See the {ref}`tutorials sections <tut_stokes>` for further information,
   or see `src/snes/tutorials/ex62.c`.
2. Use packages such as [deal.ii](https://www.dealii.org), [libMesh](https://libmesh.github.io), or
   [Firedrake](https://www.firedrakeproject.org), which use PETSc for the solvers.
3. Manage the grid data structure yourself and use PETSc `PetscSF`, `IS`, and `VecScatter` to
   perform the required ghost point communication. See
   `src/snes/tutorials/ex10d/ex10.c`.

### DMDA decomposes the domain differently than the MPI_Cart_create() command. How can one use them together?

The `MPI_Cart_create()` first divides the mesh along the z direction, then the y, then
the x. `DMDA` divides along the x, then y, then z. Thus, for example, rank 1 of the
processes will be in a different part of the mesh for the two schemes. To resolve this you
can create a new MPI communicator that you pass to `DMDACreate()` that renumbers the
process ranks so that each physical process shares the same part of the mesh with both the
`DMDA` and the `MPI_Cart_create()`. The code to determine the new numbering was
provided by Rolf Kuiper:

```
// the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively
// (no parallelization in direction 'dir' means dir_procs = 1)

MPI_Comm    NewComm;
int         x, y, z;
PetscMPIInt MPI_Rank, NewRank;

// get rank from MPI ordering:
PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &MPI_Rank));

// calculate coordinates of processes in MPI ordering:
x = MPI_Rank / (z_procs*y_procs);
y = (MPI_Rank % (z_procs*y_procs)) / z_procs;
z = (MPI_Rank % (z_procs*y_procs)) % z_procs;

// set new rank according to PETSc ordering:
NewRank = z*y_procs*x_procs + y*x_procs + x;

// create communicator with new ranks according to PETSc ordering
PetscCallMPI(MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm));

// override the default communicator (was MPI_COMM_WORLD as default)
PETSC_COMM_WORLD = NewComm;
```

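The renumbering formula above can be checked in plain Python: for any process grid it
must be a permutation of the ranks, i.e. every PETSc rank is produced exactly once.

```python
# Verify the rank renumbering on a 2 x 3 x 4 process grid (24 ranks).
x_procs, y_procs, z_procs = 2, 3, 4
size = x_procs * y_procs * z_procs

def petsc_rank(mpi_rank):
    # coordinates in MPI_Cart_create ordering (z varies fastest)
    x = mpi_rank // (z_procs * y_procs)
    y = (mpi_rank % (z_procs * y_procs)) // z_procs
    z = (mpi_rank % (z_procs * y_procs)) % z_procs
    # PETSc DMDA ordering (x varies fastest)
    return z * y_procs * x_procs + y * x_procs + x

new_ranks = sorted(petsc_rank(r) for r in range(size))
assert new_ranks == list(range(size))   # a bijection: every rank used once
```
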
### When solving a system with Dirichlet boundary conditions I can use MatZeroRows() to eliminate the Dirichlet rows but this results in a non-Symmetric system. How can I apply Dirichlet boundary conditions but keep the matrix symmetric?

- For nonsymmetric systems put the appropriate boundary solutions in the x vector and use
  `MatZeroRows()` followed by `KSPSetOperators()`.
- For symmetric problems use `MatZeroRowsColumns()`.
- If you have many Dirichlet locations you can use `MatZeroRows()` (**not**
  `MatZeroRowsColumns()`) and `-ksp_type preonly -pc_type redistribute` (see
  `PCREDISTRIBUTE`) and PETSc will repartition the parallel matrix for load
  balancing. In this case the new matrix solved remains symmetric even though
  `MatZeroRows()` is used.

An alternative approach is, when assembling the matrix (generating values and passing
them to the matrix), never to include locations for the Dirichlet grid points in the
vector and matrix, instead taking them into account as you put the other values into the
load.

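The dense analogue of the symmetric elimination that `MatZeroRowsColumns()` performs is
easy to sketch in plain Python: move the known boundary value to the right-hand side,
then zero both the row and the column and place a one on the diagonal, so the modified
matrix stays symmetric.

```python
# 1D Laplacian-like 3x3 system with the Dirichlet condition u_0 = 3
A = [[ 2.0, -1.0,  0.0],
     [-1.0,  2.0, -1.0],
     [ 0.0, -1.0,  2.0]]
b = [1.0, 0.0, 0.0]
bc_row, bc_val = 0, 3.0

for i in range(3):
    if i != bc_row:
        b[i] -= A[i][bc_row] * bc_val  # shift the known value to the RHS
        A[i][bc_row] = 0.0             # zero the column -> symmetry preserved
        A[bc_row][i] = 0.0             # zero the row
A[bc_row][bc_row] = 1.0
b[bc_row] = bc_val
# A is still symmetric and row bc_row now enforces u_0 = 3 exactly
```
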
1135
1136There are numerous ways to work with PETSc and MATLAB. All but the first approach
1137require PETSc to be configured with --with-matlab.
1138
11391. To save PETSc `Mat` and `Vec` to files that can be read from MATLAB use
1140   `PetscViewerBinaryOpen()` viewer and `VecView()` or `MatView()` to save objects
1141   for MATLAB and `VecLoad()` and `MatLoad()` to get the objects that MATLAB has
1142   saved. See `share/petsc/matlab/PetscBinaryRead.m` and
1143   `share/petsc/matlab/PetscBinaryWrite.m` for loading and saving the objects in MATLAB.
11442. Using the [MATLAB Engine](https://www.mathworks.com/help/matlab/calling-matlab-engine-from-c-programs-1.html),
1145   allows PETSc to automatically call MATLAB to perform some specific computations. This
1146   does not allow MATLAB to be used interactively by the user. See the
1147   `PetscMatlabEngine`.
11483. You can open a socket connection between MATLAB and PETSc to allow sending objects back
1149   and forth between an interactive MATLAB session and a running PETSc program. See
1150   `PetscViewerSocketOpen()` for access from the PETSc side and
1151   `share/petsc/matlab/PetscReadBinary.m` for access from the MATLAB side.
11524. You can save PETSc `Vec` (**not** `Mat`) with the `PetscViewerMatlabOpen()`
1153   viewer that saves `.mat` files can then be loaded into MATLAB using the `load()` command
1154
### How do I get started with Cython so that I can extend petsc4py?

1. Learn how to [build a Cython module](http://docs.cython.org/src/quickstart/build.html).
2. Go through the simple [example](https://stackoverflow.com/questions/3046305/simple-wrapping-of-c-code-with-cython). Note
   also the next comment that shows how to create numpy arrays in Cython and pass them
   back.
3. Check out [this page](http://docs.cython.org/src/tutorial/numpy.html) which tells
   you how to get fast indexing.
4. Have a look at the petsc4py [array source](http://code.google.com/p/petsc4py/source/browse/src/PETSc/arraynpy.pxi).

### How do I compute a custom norm for KSP to use as a convergence test or for monitoring?

You need to call `KSPBuildResidual()` on your `KSP` object and then compute the
appropriate norm on the resulting residual. Note that depending on the
`KSPSetNormType()` of the method you may not return the same norm as provided by the
method. See also `KSPSetPCSide()`.

### If I have a sequential program can I use a PETSc parallel solver?

:::{important}
Do not expect to get great speedups! Much of the speedup gained by using parallel
solvers comes from building the underlying matrices and vectors in parallel to begin
with. You should see some reduction in the time for the linear solvers.
:::

Yes, you must set up PETSc with MPI (even though you will not use MPI) with at least the
following options:

```console
$ ./configure --download-superlu_dist --download-parmetis --download-metis --with-openmp
```

Your compiler must support OpenMP. To have the linear solver run in parallel run your
program with

```console
$ OMP_NUM_THREADS=n ./myprog -pc_type lu -pc_factor_mat_solver_type superlu_dist
```

where `n` is the number of threads and should be less than or equal to the number of cores
available.

:::{note}
If your code is MPI parallel you can also use these same options to have SuperLU_dist
utilize multiple threads per MPI process for the direct solver. Make sure that the
`$OMP_NUM_THREADS` you use per MPI process is less than or equal to the number of
cores available for each MPI process. For example, if your compute nodes have 6 cores
and you use 2 MPI processes per node then set `$OMP_NUM_THREADS` to 2 or 3.
:::

Another approach that allows using a PETSc parallel solver is to use `PCMPI`.

### TS or SNES produces infeasible (out of domain) solutions or states. How can I prevent this?

For `TS` call `TSSetFunctionDomainError()`. For both `TS` and `SNES` call
`SNESSetFunctionDomainError()` when the solver passes an infeasible (out of domain)
solution or state to your routines.

If it occurs for DAEs, it is important to ensure the algebraic constraints are well
satisfied, which can prevent "breakdown" later. Thus, one can try using a tight tolerance
for `SNES`, using a direct linear solver (`PCType` of `PCLU`) when possible, and reducing the timestep (or
tightening `TS` tolerances for adaptive time stepping).

### Can PETSc work with Hermitian matrices?

PETSc's support of Hermitian matrices is limited. Many operations and solvers work
for symmetric (`MATSBAIJ`) matrices and operations on transpose matrices but there is
little direct support for Hermitian matrices and Hermitian transpose (complex conjugate
transpose) operations. There is `KSPSolveTranspose()` for solving the transpose of a
linear system but no `KSPSolveHermitian()`.

For creating known Hermitian matrices:

- `MatCreateNormalHermitian()`
- `MatCreateHermitianTranspose()`

For determining or setting Hermitian status on existing matrices:

- `MatIsHermitian()`
- `MatIsHermitianKnown()`
- `MatIsStructurallySymmetric()`
- `MatIsSymmetricKnown()`
- `MatIsSymmetric()`
- `MatSetOption()` (use with `MAT_SYMMETRIC` or `MAT_HERMITIAN` to assert to PETSc
  that either is the case).

For performing matrix operations on known Hermitian matrices (note that regular `Mat`
functions such as `MatMult()` will of course also work on Hermitian matrices):

- `MatMultHermitianTranspose()`
- `MatMultHermitianTransposeAdd()` (very limited support)

### How can I assemble a bunch of similar matrices?

You can first add the values common to all the matrices, then use `MatStoreValues()` to
stash the common values. Each iteration you call `MatRetrieveValues()`, then set the
unique values in a matrix and assemble.

### Can one resize or change the size of PETSc matrices or vectors?

No, once the vector or matrix sizes have been set and the matrices or vectors are fully
usable one cannot change the size of the matrices or vectors or the number of processors they
live on. One may create new vectors and copy, for example using `VecScatterCreate()`,
the old values from the previous vector.

### How can one compute the nullspace of a sparse matrix with MUMPS?

Assuming you have an existing matrix $A$ whose nullspace $V$ you want to find:

```
Mat      F, work, V;
PetscInt N, rows;

/* Determine factorability */
PetscCall(MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F));
PetscCall(MatGetLocalSize(A, &rows, NULL));

/* Set MUMPS options, see MUMPS documentation for more information */
PetscCall(MatMumpsSetIcntl(F, 24, 1));
PetscCall(MatMumpsSetIcntl(F, 25, 1));

/* Perform factorization */
PetscCall(MatLUFactorSymbolic(F, A, NULL, NULL, NULL));
PetscCall(MatLUFactorNumeric(F, A, NULL));

/* This is the dimension of the null space */
PetscCall(MatMumpsGetInfog(F, 28, &N));

/* This will contain the null space in the columns */
PetscCall(MatCreateDense(comm, rows, N, PETSC_DETERMINE, PETSC_DETERMINE, NULL, &V));

PetscCall(MatDuplicate(V, MAT_DO_NOT_COPY_VALUES, &work));
PetscCall(MatMatSolve(F, work, V));
```

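Whatever tool produces the candidate nullspace, the result is cheap to sanity-check:
every column $v$ of $V$ must satisfy $Av = 0$. A tiny plain-Python check on a rank-one
matrix:

```python
# Rank-1 matrix; its nullspace is spanned by (2, -1).
A = [[1.0, 2.0],
     [2.0, 4.0]]
v = [2.0, -1.0]

# residual = A v, which must vanish for a nullspace vector
residual = [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]
assert all(abs(r) < 1e-12 for r in residual)
```

For matrices coming out of MUMPS, apply the same test with `MatMult()` and `VecNorm()`
on each column of `V`.
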
______________________________________________________________________

## Execution

### PETSc executables are SO big and take SO long to link

:::{note}
See {ref}`shared libraries section <doc_faq_sharedlibs>` for more information.
:::

We find this annoying as well. On most machines PETSc can use shared libraries, so
executables should be much smaller; run `configure` with the additional option
`--with-shared-libraries` (this is the default). Also, if you have room, compiling and
linking PETSc on your machine's `/tmp` disk or a similar local disk, rather than over the
network, will be much faster.

### How does PETSc's `-help` option work? Why is it different for different programs?

There are two ways in which one interacts with the options database:

- `PetscOptionsGetXXX()` where `XXX` is some type or data structure (for example
  `PetscOptionsGetBool()` or `PetscOptionsGetScalarArray()`). This is a classic
  "getter" function, which queries the command line options for a matching option name,
  and returns the specified value.
- `PetscOptionsXXX()` where `XXX` is some type or data structure (for example
  `PetscOptionsBool()` or `PetscOptionsScalarArray()`). This is a so-called "provider"
  function. It first records the option name in an internal list of previously encountered
  options before calling `PetscOptionsGetXXX()` to query the status of said option.

While users generally use the first variant, developers will *always* use the second
(provider) variant of functions. Thus, as the program runs, it will build up a list of
encountered option names which are then printed **in the order of their appearance on the
root rank**. Different programs may take different paths through PETSc source code, so
they will encounter different providers, and therefore have different `-help` output.

### PETSc has so many options for my program that it is hard to keep them straight

Running the PETSc program with the option `-help` will print out many of the options. To
see the options that were specified within a program, use `-options_left`, which prints
all options used plus any options that the user specified but that were not actually used
by the program; this is helpful for detecting typos. The PETSc website has a search option,
in the upper right hand corner, that quickly finds answers to most PETSc questions.

### PETSc automatically handles many of the details in parallel PDE solvers. How can I understand what is really happening within my program?

You can use the option `-info` to get more details about the solution process. The
option `-log_view` provides details about the distribution of time spent in the various
phases of the solution process. You can run with `-ts_view`, `-snes_view`, or
`-ksp_view` to see what solver options are being used. Run with `-ts_monitor`,
`-snes_monitor`, or `-ksp_monitor` to watch convergence of the
methods. `-snes_converged_reason` and `-ksp_converged_reason` will indicate why and if
the solvers have converged.

### Assembling large sparse matrices takes a long time. What can I do to make this process faster? Or MatSetValues() is so slow; what can I do to speed it up?

You probably need to do preallocation, as explained in {any}`sec_matsparse`.
See also the {ref}`performance chapter <ch_performance>` of the users manual.

For GPUs (and even CPUs) you can use `MatSetPreallocationCOO()` and `MatSetValuesCOO()` for more rapid assembly.

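The COO assembly model is easy to picture: each element contributes `(row, col, value)`
triplets, and repeated index pairs are summed, which is what happens when neighboring
elements share a vertex. A minimal plain-Python sketch of that summing semantics:

```python
# Triplets as they would be handed to MatSetPreallocationCOO()/MatSetValuesCOO();
# the two (1, 1) contributions come from two "elements" sharing that unknown.
rows = [0, 1, 1, 2]
cols = [0, 1, 1, 2]
vals = [1.0, 0.5, 0.5, 2.0]

# Repeated (i, j) entries accumulate, i.e. ADD_VALUES semantics.
dense = {}
for i, j, v in zip(rows, cols, vals):
    dense[(i, j)] = dense.get((i, j), 0.0) + v

assert dense[(1, 1)] == 1.0   # 0.5 + 0.5 summed into one entry
```

Building the three flat arrays once and reusing them each time the values change is what
makes this path fast on GPUs.
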
### How can I generate performance summaries with PETSc?

Use this option at runtime:

- `-log_view` Outputs a comprehensive timing, memory consumption, and communication
  digest for your program. See the {ref}`profiling chapter <ch_profiling>` of the users
  manual for information on interpreting the summary data.

### How do I know the amount of time spent on each level of the multigrid solver/preconditioner?

Run with `-log_view` and `-pc_mg_log`.

### Where do I get the input matrices for the examples?

Some examples use `$DATAFILESPATH/matrices/medium` and other files. These test matrices
in PETSc binary format can be found in the [datafiles repository](https://gitlab.com/petsc/datafiles).

### When I dump some matrices and vectors to binary, I seem to be generating some empty files with `.info` extensions. What's the deal with these?

PETSc binary viewers put some additional information into `.info` files like matrix
block size. It is harmless but if you *really* don't like it you can use
`-viewer_binary_skip_info` or `PetscViewerBinarySkipInfo()`.

:::{note}
You need to call `PetscViewerBinarySkipInfo()` before
`PetscViewerFileSetName()`. In other words you **cannot** use
`PetscViewerBinaryOpen()` directly.
:::

1383### Why is my parallel solver slower than my sequential solver, or I have poor speed-up?
1384
1385This can happen for many reasons:
1386
13871. Make sure it is truly the time in `KSPSolve()` that is slower (by running the code
1388   with `-log_view`). Often the slower time is in generating the matrix or some other
1389   operation.
13902. There must be enough work for each process to outweigh the communication time. We
1391   recommend an absolute minimum of about 10,000 unknowns per process, better is 20,000 or
1392   more. This is even more true when using multiple GPUs, where you need to have millions
1393   of unknowns per GPU.
13943. Make sure the {ref}`communication speed of the parallel computer
1395   <doc_faq_general_parallel>` is good enough for parallel solvers.
13964. Check the number of solver iterates with the parallel solver against the sequential
1397   solver. Most preconditioners require more iterations when used on more processes, this
1398   is particularly true for block Jacobi (the default parallel preconditioner). You can
1399   try `-pc_type asm` (`PCASM`) its iterations scale a bit better for more
1400   processes. You may also consider multigrid preconditioners like `PCMG` or BoomerAMG
1401   in `PCHYPRE`.

(doc_faq_pipelined)=

### What steps are necessary to make the pipelined solvers execute efficiently?

Pipelined solvers like `KSPPGMRES`, `KSPPIPECG`, `KSPPIPECR`, and `KSPGROPPCG` may
require special MPI configuration to effectively overlap reductions with computation. In
general, this requires an MPI-3 implementation that supports multiple threads and the use
of a "progress thread". Consult the documentation from your vendor or computing facility
for more details.

- Cray MPT-5.6 and later implements MPI-3. Set `$MPICH_MAX_THREAD_SAFETY` to "multiple"
  for thread support, plus either `$MPICH_ASYNC_PROGRESS` or
  `$MPICH_NEMESIS_ASYNC_PROGRESS` for a progress thread. E.g.
  ```console
  $ export MPICH_MAX_THREAD_SAFETY=multiple
  $ export MPICH_ASYNC_PROGRESS=1
  $ export MPICH_NEMESIS_ASYNC_PROGRESS=1
  ```

- MPICH version 3.0 and later implements the MPI-3 standard and the default
  configuration supports use of threads. Use of a progress thread is configured by
  setting the environment variable `$MPICH_ASYNC_PROGRESS`. E.g.
  ```console
  $ export MPICH_ASYNC_PROGRESS=1
  ```

### When using PETSc in single precision mode (`--with-precision=single` when running `configure`) are the operations done in single or double precision?

PETSc does **NOT** do any explicit conversion of single precision to double before
performing computations; what happens depends on the hardware and the compiler. For
example, the compiler could choose to put the single precision numbers into the usual
double precision registers and then use the usual double precision floating point unit. Or
it could use SSE2 instructions that work directly on the single precision numbers. It is a
bit of a mystery what decisions get made sometimes. There may be compiler flags in some
circumstances that can affect this.

### Why is Newton's method (SNES) not converging, or converging slowly?

Newton's method may not converge for many reasons; here are some of the most common:

- The Jacobian is wrong (or correct in sequential but not in parallel).
- The linear system is {ref}`not solved <doc_faq_execution_kspconv>` or is not solved
  accurately enough.
- The Jacobian system has a singularity that the linear solver is not handling.
- There is a bug in the function evaluation routine.
- The function is not continuous or does not have continuous first derivatives (e.g. phase
  change or TVD limiters).
- The equations may not have a solution (e.g. limit cycle instead of a steady state) or
  there may be a "hill" between the initial guess and the steady state (e.g. reactants
  must ignite and burn before reaching a steady state, but the steady-state residual will
  be larger during combustion).

Here are some of the ways to help debug lack of convergence of Newton:

- Run on one processor to see if the problem is only in parallel.

- Run with `-info` to get more detailed information on the solution process.

- Run with the options

  ```text
  -snes_monitor -ksp_monitor_true_residual -snes_converged_reason -ksp_converged_reason
  ```

  - If the linear solve does not converge, check if the Jacobian is correct, then see
    {ref}`this question <doc_faq_execution_kspconv>`.
  - If the preconditioned residual converges, but the true residual does not, the
    preconditioner may be singular.
  - If the linear solve converges well, but the line search fails, the Jacobian may be
    incorrect.

- Run with `-pc_type lu` or `-pc_type svd` to see if the problem is a poor linear
  solver.

- Run with `-mat_view` or `-mat_view draw` to see if the Jacobian looks reasonable.

- Run with `-snes_test_jacobian -snes_test_jacobian_view` to see if the Jacobian you are
  using is wrong. Compare the output when you add `-mat_fd_type ds` to see if the result
  is sensitive to the choice of differencing parameter.

- Run with `-snes_mf_operator -pc_type lu` to see if the Jacobian you are using is
  wrong. If the problem is too large for a direct solve, try

  ```text
  -snes_mf_operator -pc_type ksp -ksp_ksp_rtol 1e-12
  ```

  Compare the output when you add `-mat_mffd_type ds` to see if the result is sensitive
  to the choice of differencing parameter.

- Run with `-snes_linesearch_monitor` to see if the line search is failing (this is
  usually a sign of a bad Jacobian). Use `-info` in PETSc 3.1 and older versions,
  `-snes_ls_monitor` in PETSc 3.2, and `-snes_linesearch_monitor` in PETSc 3.3 and
  later.

Here are some ways to help the Newton process if everything above checks out:

- Run with grid sequencing (`-snes_grid_sequence` is all you need if working with a
  `DM`) to generate a better initial guess on your finer mesh.

- Run with quad precision, i.e.

  ```console
  $ ./configure --with-precision=__float128 --download-f2cblaslapack
  ```

  :::{note}
  Quad precision requires PETSc 3.2 or later and recent versions of the GNU compilers.
  :::

- Change the units (nondimensionalization), boundary condition scaling, or formulation so
  that the Jacobian is better conditioned. See the [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).

- Mollify features in the function that do not have continuous first derivatives (often
  occurs when there are "if" statements in the residual evaluation, e.g. phase change or
  TVD limiters). Use a variational inequality solver (`SNESVINEWTONRSLS`) if the
  discontinuities are of fundamental importance.

- Try a trust region method (`-snes_type newtontr`; you may have to adjust the parameters).

- Run with some continuation parameter from a point where you know the solution; see
  `TSPSEUDO` for steady states.

- There are homotopy solver packages like PHCpack that can find all possible solutions
  (and tell you that they have found them all), but those are not scalable and cannot solve
  anything but small problems.

(doc_faq_execution_kspconv)=

### Why is the linear solver (KSP) not converging, or converging slowly?

:::{tip}
Always run with `-ksp_converged_reason -ksp_monitor_true_residual` when trying to
learn why a method is not converging!
:::

Common reasons for KSP not converging are:

- A symmetric method is being used for a non-symmetric problem.

- The equations are singular by accident (e.g. forgot to impose boundary
  conditions). Check this for a small problem using `-pc_type svd -pc_svd_monitor`.

- The equations are intentionally singular (e.g. constant null space), but the Krylov
  method was not informed; see `MatSetNullSpace()`.

- The equations are intentionally singular and `MatSetNullSpace()` was used, but the
  right-hand side is not consistent. You may have to call `MatNullSpaceRemove()` on the
  right-hand side before calling `KSPSolve()`. See `MatSetTransposeNullSpace()`.

- The equations are indefinite so that standard preconditioners don't work. Usually you
  will know this from the physics, but you can check with

  ```text
  -ksp_compute_eigenvalues -ksp_gmres_restart 1000 -pc_type none
  ```

  For simple saddle point problems, try

  ```text
  -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_detect_saddle_point
  ```

  For more difficult problems, read the literature to find robust methods, and ask
  <mailto:petsc-users@mcs.anl.gov> or <mailto:petsc-maint@mcs.anl.gov> if you want advice about how to
  implement them.

- If the method converges in the preconditioned residual, but not in the true residual, the
  preconditioner is likely singular or nearly so. This is common for saddle point problems
  (e.g. incompressible flow) or strongly nonsymmetric operators (e.g. low-Mach hyperbolic
  problems with large time steps).

- The preconditioner is too weak or is unstable. See if `-pc_type asm -sub_pc_type lu`
  improves the convergence rate. If GMRES is losing too much progress in the restart, see
  if longer restarts help, e.g. `-ksp_gmres_restart 300`. Try `-ksp_type bcgs` or other
  methods that do not require a restart.

  :::{note}
  Unfortunately convergence with these methods is frequently erratic.
  :::

- The preconditioner is nonlinear (e.g. a nested iterative solve); try `-ksp_type
  fgmres` or `-ksp_type gcr`.

- You are using geometric multigrid, but some equations (often boundary conditions) are
  not scaled compatibly between levels. Try `-pc_mg_galerkin` to algebraically
  construct a correctly scaled coarse operator, or make sure that all the equations are
  scaled in the same way if you want to use rediscretized coarse levels.

- The matrix is very ill-conditioned. Check the {ref}`condition number <doc_faq_usage_condnum>`.

  - Try to improve it by choosing the relative scaling of components/boundary conditions.
  - Try `-ksp_diagonal_scale -ksp_diagonal_scale_fix`.
  - Perhaps change the formulation of the problem to produce more friendly algebraic
    equations.

- Change the units (nondimensionalization), boundary condition scaling, or formulation so
  that the Jacobian is better conditioned. See the [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).

- Classical Gram-Schmidt is becoming unstable; try `-ksp_gmres_modifiedgramschmidt` or
  use a method that orthogonalizes differently, e.g. `-ksp_type gcr`.

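For the common case above of a system with a constant null space and a possibly
inconsistent right-hand side, the calls look roughly like this sketch, where the matrix
`A`, vectors `b` and `x`, and solver `ksp` are assumed to already exist:

```
MatNullSpace nullsp;

PetscCall(MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, NULL, &nullsp));
PetscCall(MatSetNullSpace(A, nullsp));    /* inform the Krylov method of the singularity */
PetscCall(MatNullSpaceRemove(nullsp, b)); /* make the right-hand side consistent */
PetscCall(MatNullSpaceDestroy(&nullsp));
PetscCall(KSPSolve(ksp, b, x));
```
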
### I get the error message: Actual argument at (1) to assumed-type dummy is of derived type with type-bound or FINAL procedures

Use the following code-snippet:

```fortran
module context_module
#include <petsc/finclude/petsc.h>
  use petsc
  implicit none
  private
  type, public :: context_type
    private
    PetscInt :: foo
  contains
    procedure, public :: init => context_init
  end type context_type
contains
  subroutine context_init(self, foo)
    class(context_type), intent(in out) :: self
    PetscInt, intent(in) :: foo
    self%foo = foo
  end subroutine context_init
end module context_module

!------------------------------------------------------------------------

program test_snes
  use, intrinsic :: iso_c_binding
  use petsc
  use context_module
  implicit none

  SNES :: snes
  type(context_type), target :: context
  type(c_ptr) :: contextOut
  PetscErrorCode :: ierr

  call PetscInitialize(PETSC_NULL_CHARACTER, ierr)
  call SNESCreate(PETSC_COMM_WORLD, snes, ierr)
  call context%init(1)

  contextOut = c_loc(context) ! contextOut is a C pointer to context

  call SNESSetConvergenceTest(snes, convergence, contextOut, PETSC_NULL_FUNCTION, ierr)

  call SNESDestroy(snes, ierr)
  call PetscFinalize(ierr)

contains

  subroutine convergence(snes, num_iterations, xnorm, pnorm, fnorm, reason, contextIn, ierr)
    SNES, intent(in) :: snes
    PetscInt, intent(in) :: num_iterations
    PetscReal, intent(in) :: xnorm, pnorm, fnorm
    SNESConvergedReason, intent(out) :: reason
    type(c_ptr), intent(in out) :: contextIn
    type(context_type), pointer :: context
    PetscErrorCode, intent(out) :: ierr

    ! convert the C pointer to a Fortran pointer to use context as in the main program
    call c_f_pointer(contextIn, context)
    reason = 0
    ierr = 0
  end subroutine convergence
end program test_snes
```

### In C++ I get a crash on `VecDestroy()` (or some other PETSc object) at the end of the program

This can happen when the destructor for a C++ class is automatically called at the end of
the program after `PetscFinalize()`. Use the following code-snippet:

```
int main(int argc, char **argv)
{
  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  {
    /* your variables and your code */

    ...   /* all your destructors are called here automatically by C++ so they work correctly */
  }
  PetscCall(PetscFinalize());
  return 0;
}
```

______________________________________________________________________

## Debugging

### What does the message hwloc/linux: Ignoring PCI device with non-16bit domain mean?

This is printed by the hwloc library that is used by some MPI implementations. It can be
ignored. To prevent the message from always being printed, set the environment variable
`HWLOC_HIDE_ERRORS` to 2. For example

```console
$ export HWLOC_HIDE_ERRORS=2
```

which can be added to your login profile file such as `~/.bashrc`.

### How do I turn off PETSc signal handling so I can use the `-C` option on `xlf`?

Immediately after calling `PetscInitialize()` call `PetscPopSignalHandler()`.

Some Fortran compilers, including IBM's xlf and xlF, have a compile option
(`-C` for IBM's compilers) that causes all Fortran array accesses to be checked for being
in-bounds. This is a great feature but does require that the array dimensions be set
explicitly, not with a \*.

### How do I debug if `-start_in_debugger` does not work on my machine?

The script <https://github.com/Azrael3000/tmpi> can be used to run multiple MPI
ranks in the debugger using tmux.

On newer macOS machines, one has to be in the admin group to be able to use the debugger.

On newer Ubuntu Linux machines, one has to disable `ptrace_scope` with

```console
$ echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
```

to get `-start_in_debugger` working.

If `-start_in_debugger` does not work on your OS, for a uniprocessor job, just
try the debugger directly, for example: `gdb ex1`. You can also use [TotalView](https://totalview.io/products/totalview), which is a good graphical parallel debugger.

### How do I see where my code is hanging?

You can use the `-start_in_debugger` option to start all processes in the debugger (each
will come up in its own xterm) or run in [TotalView](https://totalview.io/products/totalview). Then use `cont` (for continue) in each
xterm. Once you are sure that the program is hanging, hit control-c in each xterm and then
use `where` to print a stack trace for each process.

### How can I inspect PETSc vector and matrix values when in the debugger?

I will illustrate this with `gdb`, but it should be similar on other debuggers. You can
look at local `Vec` values directly by obtaining the array. For a `Vec` v, we can
print all local values using:

```console
(gdb) p ((Vec_Seq*) v->data)->array[0]@v->map.n
```

However, this becomes much more complicated for a matrix. Therefore, it is advisable to
use the default viewer to look at the object. For a `Vec` v and a `Mat` m, this would
be:

```console
(gdb) call VecView(v, 0)
(gdb) call MatView(m, 0)
```

or with a communicator other than `MPI_COMM_WORLD`:

```console
(gdb) call MatView(m, PETSC_VIEWER_STDOUT_(m->comm))
```

TotalView 8.8.0+ has a feature that allows libraries to provide their own code to
display objects in the debugger. Thus in theory each PETSc object, `Vec`, `Mat`, etc.
could have custom code to print values in the object. We have only done this for the most
elementary display of `Vec` and `Mat`. See the routine `TV_display_type()` in
`src/vec/vec/interface/vector.c` for an example of how these may be written. Contact us
if you would like to add more.

### How can I find the cause of floating point exceptions like not-a-number (NaN) or infinity?

The best way to locate floating point exceptions is to use a debugger. On supported
architectures (including Linux and glibc-based systems), just run in a debugger and pass
`-fp_trap` to the PETSc application. This will activate signaling exceptions and the
debugger will break on the line that first divides by zero or otherwise generates an
exception.

Without a debugger, running with `-fp_trap` in debug mode will only identify the
function in which the error occurred, but not the line or the type of exception. If
`-fp_trap` is not supported on your architecture, consult the documentation for your
debugger, since there is likely a way to have it catch exceptions.

(error_libimf)=

### Error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory

The Intel compilers use shared libraries (like libimf) that cannot be found, by default, at run
time. When using the Intel compilers (and running the resulting code) you must make sure
that the proper Intel initialization scripts are run. This is usually done by adding some
code into your `.cshrc`, `.bashrc`, `.profile` etc. file. Sometimes on batch
systems that do not access your initialization files (like `.cshrc`) you must include the
initialization calls in your batch file submission.

For example, on my Mac using `csh` I have the following in my `.cshrc` file:

```csh
source /opt/intel/cc/10.1.012/bin/iccvars.csh
source /opt/intel/fc/10.1.012/bin/ifortvars.csh
source /opt/intel/idb/10.1.012/bin/idbvars.csh
```

And in my `.profile` I have

```sh
source /opt/intel/cc/10.1.012/bin/iccvars.sh
source /opt/intel/fc/10.1.012/bin/ifortvars.sh
source /opt/intel/idb/10.1.012/bin/idbvars.sh
```

(object_type_not_set)=

### What does "Object Type Not Set: Argument # N" mean?

Many operations on PETSc objects require that the specific type of the object be set
before the operation is performed. You must call `XXXSetType()` or
`XXXSetFromOptions()` before you make the offending call. For example

```
Mat A;

PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
PetscCall(MatSetValues(A, ...));
```

will not work. You must add `MatSetType()` or `MatSetFromOptions()` before the call to
`MatSetValues()`. I.e.

```
Mat A;

PetscCall(MatCreate(PETSC_COMM_WORLD, &A));

PetscCall(MatSetType(A, MATAIJ));
/* or, allowing the type set above to be overridden from the options database */
PetscCall(MatSetFromOptions(A));

PetscCall(MatSetValues(A, ...));
```

(split_ownership)=

### What does the error detected in `PetscSplitOwnership()` about "sum of local lengths ..." mean?

In a previous call to `VecSetSizes()`, `MatSetSizes()`, `VecCreateXXX()` or
`MatCreateXXX()` you passed in local and global sizes that do not make sense for the
number of processes being used. For example, if you pass in a local size of 2 and a global
size of 100 and run on two processes, this cannot work since the sum of the local sizes
is 4, not 100.
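
Either let PETSc choose compatible local sizes, or supply local sizes that sum to the
global size. A sketch (the sizes are illustrative):

```
Vec v;

PetscCall(VecCreate(PETSC_COMM_WORLD, &v));
/* PETSC_DECIDE lets PETSc pick local sizes that sum to the global size 100 */
PetscCall(VecSetSizes(v, PETSC_DECIDE, 100));
/* or, on two processes, explicit local sizes of 50 and 50 would also be consistent */
PetscCall(VecSetFromOptions(v));
```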

(valgrind)=

### What does "corrupt argument" or "caught signal" or "SEGV" or "segmentation violation" or "bus error" mean? Can I use Valgrind or CUDA Compute Sanitizer to debug memory corruption issues?

Sometimes it can mean an argument to a function is invalid. In Fortran this may be caused
by forgetting to list an argument in the call, especially the final `PetscErrorCode`.

Otherwise it is usually caused by memory corruption; that is, somewhere the code is writing
out of array bounds. To track this down rerun the debug version of the code with the
option `-malloc_debug`. Occasionally the code may crash only with the optimized version;
in that case run the optimized version with `-malloc_debug`. If you determine the
problem is from memory corruption you can put the macro `CHKMEMQ` in the code near the crash
to determine exactly what line is causing the problem.

If `-malloc_debug` does not help: on NVIDIA CUDA systems you can use the [Compute Sanitizer](https://docs.nvidia.com/compute-sanitizer/ComputeSanitizer/index.html),
for example, `compute-sanitizer --tool memcheck [sanitizer_options] app_name [app_options]`.

If `-malloc_debug` does not help: on GNU/Linux (not macOS machines) you can
use [Valgrind](http://valgrind.org). Follow the instructions below:

1. `configure` PETSc with `--download-mpich --with-debugging` (you can use other MPI implementations but most produce spurious Valgrind messages)

2. Compile your application code with this build of PETSc.

3. Run with Valgrind.

   ```console
   $ $PETSC_DIR/lib/petsc/bin/petscmpiexec -valgrind -n NPROC PETSCPROGRAMNAME PROGRAMOPTIONS
   ```

   or

   ```console
   $ mpiexec -n NPROC valgrind --tool=memcheck -q --num-callers=20 \
   --suppressions=$PETSC_DIR/share/petsc/suppressions/valgrind \
   --log-file=valgrind.log.%p PETSCPROGRAMNAME -malloc off PROGRAMOPTIONS
   ```

:::{note}
- The option `--with-debugging` enables Valgrind to give a stack trace with additional
  source-file:line-number information.
- The `--download-mpich` build is Valgrind clean; other MPI builds are not.
- When `--download-mpich` is used, mpiexec will be in `$PETSC_ARCH/bin`.
- The `--log-file=valgrind.log.%p` option tells Valgrind to store the output from each
  process in a different file (as `%p`, i.e. the PID, is different for each MPI process).
- `memcheck` will not find certain array accesses that violate static array
  declarations, so if memcheck runs clean you can try `--tool=exp-ptrcheck`
  instead.
:::

You might also consider using [Dr. Memory](http://drmemory.org), which has support for GNU/Linux,
macOS, and Microsoft Windows machines. (Note we haven't tried this ourselves.)

(zeropivot)=

### What does "detected zero pivot in LU factorization" mean?

A zero pivot in LU, ILU, Cholesky, or ICC sparse factorization does not always mean that
the matrix is singular. You can use

```text
-pc_factor_shift_type nonzero -pc_factor_shift_amount [amount]
```

or

```text
-pc_factor_shift_type positive_definite -[level]_pc_factor_shift_type nonzero
 -pc_factor_shift_amount [amount]
```

or

```text
-[level]_pc_factor_shift_type positive_definite
```

to prevent the zero pivot. [level] is "sub" when lu, ilu, cholesky, or icc are employed in
each individual block of the bjacobi or ASM preconditioner. [level] is "mg_levels" or
"mg_coarse" when lu, ilu, cholesky, or icc are used inside the multigrid smoothers or the
coarse grid solver. See `PCFactorSetShiftType()`, `PCFactorSetShiftAmount()`.

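The same shifts can also be set from code; a sketch, assuming `ksp` is a `KSP` whose
preconditioner uses a factorization:

```
PC pc;

PetscCall(KSPGetPC(ksp, &pc));
PetscCall(PCFactorSetShiftType(pc, MAT_SHIFT_NONZERO));
PetscCall(PCFactorSetShiftAmount(pc, PETSC_DECIDE)); /* or an explicit amount */
```
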
This error can also happen if your matrix is singular; see `MatSetNullSpace()` for how
to handle this. If this error occurs in the zeroth row of the matrix, it is likely you
have an error in the code that generates the matrix.

### You create draw windows or `PETSCVIEWERDRAW` windows or use options `-ksp_monitor draw::draw_lg` or `-snes_monitor draw::draw_lg` and the program seems to run OK but windows never open

The libraries were compiled without support for X windows. Make sure that `configure`
was run with the option `--with-x`.

### The program seems to use more and more memory as it runs, even though you don't think you are allocating more memory

Some of the following may be occurring:

- You are creating new PETSc objects but never freeing them.
- There is a memory leak in PETSc or your code.
- Something much more subtle (if you are using Fortran): when you declare a large array
  in Fortran, the operating system does not allocate all the memory pages for that array
  until you start using the different locations in the array. Thus, in a code, if at each
  step you start using later values in the array your virtual memory usage will "continue"
  to increase as measured by `ps` or `top`.
- You are running with the `-log`, `-log_mpe`, or `-log_all` option. With these
  options, a great deal of logging information is stored in memory until the conclusion of
  the run.
- You are linking with the MPI profiling libraries; these cause logging of all MPI
  activities. Another symptom is at the conclusion of the run it may print some message
  about writing log files.

The following may help:

- Run with the `-malloc_debug` option and `-malloc_view`. Or use `PetscMallocDump()`
  and `PetscMallocView()` sprinkled about your code to track memory that is allocated
  and not later freed. Use the commands `PetscMallocGetCurrentUsage()` and
  `PetscMemoryGetCurrentUsage()` to monitor memory allocated and
  `PetscMallocGetMaximumUsage()` and `PetscMemoryGetMaximumUsage()` for total memory
  used as the code progresses.
- This is just the way Unix works and is harmless.
- Do not use the `-log`, `-log_mpe`, or `-log_all` option, or use
  `PetscLogEventDeactivate()` or `PetscLogEventDeactivateClass()` to turn off logging of
  specific events.
- Make sure you do not link with the MPI profiling libraries.
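
The monitoring calls above can be sprinkled at suspect points in the code; a sketch
(assuming PETSc has been initialized):

```
PetscLogDouble mallocBytes, memBytes;

PetscCall(PetscMallocGetCurrentUsage(&mallocBytes)); /* bytes currently allocated via PetscMalloc() */
PetscCall(PetscMemoryGetCurrentUsage(&memBytes));    /* current memory usage of the process in bytes */
PetscCall(PetscPrintf(PETSC_COMM_WORLD, "malloc %g bytes, process %g bytes\n", mallocBytes, memBytes));
```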

### When calling `MatPartitioningApply()` you get a message "Error! Key 16615 Not Found"

The graph of the matrix you are using is not symmetric. You must use symmetric matrices
for partitioning.

### With GMRES at restart the second residual norm printed does not match the first

I.e.

```text
26 KSP Residual norm 3.421544615851e-04
27 KSP Residual norm 2.973675659493e-04
28 KSP Residual norm 2.588642948270e-04
29 KSP Residual norm 2.268190747349e-04
30 KSP Residual norm 1.977245964368e-04
30 KSP Residual norm 1.994426291979e-04 <----- At restart the residual norm is printed a second time
```

This is actually not surprising! GMRES computes the norm of the residual at each iteration
via a recurrence relation between the norms of the residuals at the previous iterations
and quantities computed at the current iteration. It does not compute it directly as
$\| b - A x^{n} \|$.

Sometimes, especially with an ill-conditioned matrix, or computation of the matrix-vector
product via differencing, the residual norms computed by GMRES start to "drift" from the
correct values. At the restart, we compute the residual norm directly, hence the
difference printed. The drifting, if it remains small, is harmless (it doesn't
affect the accuracy of the solution that GMRES computes).

If you use a more powerful preconditioner the drift will often be smaller and less
noticeable. Or if you are running matrix-free you may need to tune the matrix-free
parameters.

### Why do some Krylov methods seem to print two residual norms per iteration?

I.e.

```text
1198 KSP Residual norm 1.366052062216e-04
1198 KSP Residual norm 1.931875025549e-04
1199 KSP Residual norm 1.366026406067e-04
1199 KSP Residual norm 1.931819426344e-04
```

Some Krylov methods, for example `KSPTFQMR`, actually have a "sub-iteration" of size 2
inside the loop. Each of the two substeps has its own matrix-vector product and
application of the preconditioner and updates the residual approximations. This is why you
get this "funny" output where it looks like there are two residual norms per
iteration. You can also think of it as twice as many iterations.

### Unable to locate PETSc dynamic library `libpetsc`

When using dynamic libraries, the libraries cannot be moved after they are
installed. This can also happen on clusters where the paths differ on the (run)
nodes and the (compile) front-end. **Do not use dynamic libraries & shared
libraries**. Run `configure` with
`--with-shared-libraries=0 --with-dynamic-loading=0`.

:::{important}
The `--with-dynamic-loading` option was removed in PETSc 3.5.
:::

### How do I determine what update to PETSc broke my code?

If at some point (in the PETSc code history) you had a working code, but the latest PETSc
code broke it, it is possible to determine the PETSc code change that might have caused
this behavior. This is achieved by:

- Using Git to access the PETSc sources
- Knowing the Git commit for the known working version of PETSc
- Knowing the Git commit for the known broken version of PETSc
- Using the [bisect](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html)
  functionality of Git

Git bisect can be done as follows:

1. Get the PETSc development (main branch in Git) sources

   ```console
   $ git clone https://gitlab.com/petsc/petsc.git
   ```

2. Find the good and bad markers to start the bisection process. This can be done either
   by checking `git log` or `gitk` or <https://gitlab.com/petsc/petsc> or the web
   history of petsc-release clones. Let's say the known bad commit is 21af4baa815c and
   the known good commit is 5ae5ab319844.

3. Start the bisection process with these known revisions. Build PETSc, and test your code
   to confirm the known good/bad behavior:

   ```console
   $ git bisect start 21af4baa815c 5ae5ab319844
   ```

   build/test, perhaps discover that this new state is bad

   ```console
   $ git bisect bad
   ```

   build/test, perhaps discover that this state is good

   ```console
   $ git bisect good
   ```

   Now until done, keep bisecting, building PETSc, and testing your code with it to
   determine if the code is working or not. After something like 5-15 iterations, `git
   bisect` will pinpoint the exact code change that resulted in the difference in
   application behavior.

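The build-and-test loop in step 3 can often be automated with `git bisect run`, which
marks each commit good or bad based on the exit code of a script. A sketch (the build
and test commands are placeholders for your own):

```console
$ git bisect start 21af4baa815c 5ae5ab319844
$ git bisect run sh -c 'make all && cd $HOME/my-app && make test'
```
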
:::{tip}
See [git-bisect(1)](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html) and the
[debugging section of the Git Book](https://git-scm.com/book/en/Git-Tools-Debugging-with-Git) for more debugging tips.
:::

### How to fix the error "PMIX Error: error in file gds_ds12_lock_pthread.c"?

This seems to be an error when using Open MPI and OpenBLAS with threads (or perhaps other
packages that use threads). Setting

```console
$ export PMIX_MCA_gds=hash
```

should resolve the problem.

______________________________________________________________________

(doc_faq_sharedlibs)=

## Shared Libraries

### Can I install PETSc libraries as shared libraries?

Yes. Use

```console
$ ./configure --with-shared-libraries
```

### Why should I use shared libraries?

When you link to shared libraries, the function symbols from the shared libraries are not
copied into the executable. This way the size of the executable is considerably smaller
than when using regular libraries. This helps in a couple of ways:

- It saves disk space when more than one executable is created.
- It improves the link time immensely, because the linker has to write a much smaller
  file (the executable) to the disk.

### How do I link to the PETSc shared libraries?

By default, the compiler should pick up the shared libraries instead of the regular
ones. Nothing special should be done for this.

### What if I want to link to the regular `.a` library files?

You must run `configure` with the option `--with-shared-libraries=0` (you can use a
different `$PETSC_ARCH` for this build so you can easily switch between the two).

### What do I do if I want to move my executable to a different machine?

You would need to have access to the shared libraries on the new machine. The
alternative is to build the executable without shared libraries by first deleting the
shared libraries and then creating the executable.

```{bibliography} /petsc.bib
:filter: docname in docnames
```