(doc_faq)=

# FAQ

```{contents} Table Of Contents
:backlinks: top
:local: true
```

______________________________________________________________________

## General

### How can I subscribe to the PETSc mailing lists?

See mailing list {ref}`documentation <doc_mail>`

### Any useful books on numerical computing?

[Bueler, PETSc for Partial Differential Equations: Numerical Solutions in C and Python](https://my.siam.org/Store/Product/viewproduct/?ProductId=32850137)

[Oliveira and Stewart, Writing Scientific Software: A Guide to Good Style](https://www.cambridge.org/core/books/writing-scientific-software/23206704175AF868E43FE3FB399C2F53)

(doc_faq_general_parallel)=

### What kind of parallel computers or clusters are needed to use PETSc? Or why do I get little speedup?

:::{important}
PETSc can be used with any kind of parallel system that supports MPI, BUT for any decent
performance one needs:

- A fast, **low-latency** interconnect; any ethernet (even 10 GigE) simply cannot provide
  the needed performance.
- High per-core **memory** performance. Each core needs to have its **own** memory
  bandwidth of at least 2 or more gigabytes/second. Most modern computers are not
  bottlenecked by how fast they can perform calculations; rather, they are usually
  restricted by how quickly they can get their data.
:::

To obtain good performance it is important that you know your machine! That is, how many
compute nodes (generally, how many motherboards), how many memory sockets per node, how
many cores per memory socket, and how much memory bandwidth each has.
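
On Linux, a quick way to answer most of these questions is a sketch like the following
(it assumes `nproc` from coreutils and `lscpu` from util-linux are installed; the exact
`lscpu` field names vary between versions):

```shell
# Total usable cores on this node
nproc
# Topology report: sockets, cores per socket, NUMA nodes, caches
lscpu
```

Note that this only reports the layout; the achievable memory bandwidth itself is what
the streams benchmark described in this section actually measures.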

If you do not know this and can run MPI programs with mpiexec (that is, you don't have a
batch system), run the following from `$PETSC_DIR`:

```console
$ make streams [NPMAX=maximum_number_of_mpi_processes_you_plan_to_use]
```

This will provide a summary of the bandwidth with different numbers of MPI
processes and the potential speedups. See {any}`ch_streams` for a detailed discussion.

If you have a batch system:

```console
$ cd $PETSC_DIR/src/benchmarks/streams
$ make MPIVersion
```

Submit MPIVersion to the batch system a number of times with 1, 2, 3, etc. MPI processes,
collecting all of the output from the runs into the single file scaling.log. Copy
scaling.log into the `src/benchmarks/streams` directory, then run:

```console
$ ./process.py createfile ; process.py
```

Even if you have enough memory bandwidth, performance can degrade if the OS switches
processes between cores. Smart process to core/socket binding (this just means locking a
process to a particular core or memory socket) may help you. For example, consider using
fewer processes than cores and binding processes to separate sockets so that each process
uses a different memory bus:

- [MPICH2 binding with the Hydra process manager](https://github.com/pmodels/mpich/blob/main/doc/wiki/how_to/Using_the_Hydra_Process_Manager.md#process-core-binding)

  ```console
  $ mpiexec.hydra -n 4 --binding cpu:sockets
  ```

- [Open MPI binding](https://www.open-mpi.org/faq/?category=tuning#using-paffinity)

  ```console
  $ mpiexec -n 4 --map-by socket --bind-to socket --report-bindings
  ```

- `taskset`, part of the [util-linux](https://github.com/karelzak/util-linux) package

  Check `man taskset` for details. Make sure to set affinity for **your** program,
  **not** for the `mpiexec` program.

- `numactl`

  In addition to task affinity, this tool also allows changing the default memory
  affinity policy.
On Linux, the default policy is to attempt to find memory on the same memory bus
  that serves the core that a thread is running on when the memory is faulted
  (not when `malloc()` is called). If local memory is not available, it is found
  elsewhere, possibly leading to serious memory imbalances.

  The option `--localalloc` allocates memory on the local NUMA node, similar to the
  `numa_alloc_local()` function in the `libnuma` library. The option
  `--cpunodebind=nodes` binds the process to a given NUMA node (note that this can be
  larger or smaller than a CPU (socket); a NUMA node usually has multiple cores).

  The option `--physcpubind=cpus` binds the process to a given processor core (numbered
  according to `/proc/cpuinfo`, therefore including logical cores if Hyper-threading is
  enabled).

  With Open MPI, you can use knowledge of the NUMA hierarchy and core numbering on your
  machine to calculate the correct NUMA node or processor number given the environment
  variable `OMPI_COMM_WORLD_LOCAL_RANK`. In most cases, it is easier to make mpiexec or
  a resource manager set affinities.

### What kind of license is PETSc released under?

See licensing {ref}`documentation <doc_license>`

### Why is PETSc written in C, instead of Fortran or C++?

When this decision was made, in the early 1990s, C enabled us to build data structures
for storing sparse matrices, solver information, etc. in ways that Fortran simply did
not allow. ANSI C was a complete standard that all modern C compilers supported. The
language was identical on all machines. C++ was still evolving and compilers on
different machines were not identical. Using C function pointers to provide data
encapsulation and polymorphism allowed us to get many of the advantages of C++ without
using such a large and more complicated language.
It would have been natural and
reasonable to have coded PETSc in C++; we opted to use C instead.

### Does all the PETSc error checking and logging reduce PETSc's efficiency?

No.

(doc_faq_maintenance_strats)=

### How does such a small group of people manage to write and maintain such a large and marvelous package as PETSc?

1. **We work very efficiently**.

   - We use powerful editors and programming environments.
   - Our manual pages are generated automatically from formatted comments in the code,
     thus alleviating the need for creating and maintaining manual pages.
   - We employ continuous integration testing of the entire PETSc library on many
     different machine architectures. This process **significantly** protects (no
     bug-catching process is perfect) against inadvertently introducing bugs with new
     additions. Every new feature **must** pass our suite of thousands of tests as well
     as formal code review before it may be included.

2. **We are very careful in our design (and are constantly revising our design)**

   - PETSc as a package should be easy to use, write, and maintain. Our mantra is to
     write code like everyone is using it.

3. **We are willing to do the grunt work**

   - PETSc is regularly checked to make sure that all code conforms to our interface
     design. We will never keep a bad design decision simply because changing it would
     require a lot of editing; we do a lot of editing.

4. **We constantly seek out and experiment with new design ideas**

   - We retain the useful ones and discard the rest. All of these decisions are based
     not just on performance, but also on **practicality**.

5. **Function and variable names must adhere to strict guidelines**

   - Even the rules about capitalization are designed to make it easy to figure out the
     name of a particular object or routine.
Our memories are terrible, so careful,
     consistent naming puts less stress on our limited human RAM.

6. **The PETSc directory tree is designed to make it easy to move throughout the
   entire package**

7. **We have a rich, robust, and fast bug reporting system**

   - <mailto:petsc-maint@mcs.anl.gov> is always checked, and we pride ourselves on
     responding quickly and accurately. Email is very lightweight, and the bug report
     system retains an archive of all reported problems and fixes, so it is easy to
     re-find fixes to previously discovered problems.

8. **We contain the complexity of PETSc by using powerful object-oriented programming
   techniques**

   - Data encapsulation serves to abstract complex data formats or movement to a
     human-readable format. This is why your program cannot, for example, look directly
     at what is inside the object `Mat`.
   - Polymorphism makes changing program behavior as easy as possible, and further
     abstracts the *intent* of your program from what is *written* in code. You call
     `MatMult()` regardless of whether your matrix is dense, sparse, parallel or
     sequential; you don't call a different routine for each format.

9. **We try to provide the functionality requested by our users**

### For complex numbers will I get better performance with C++?

To use PETSc with complex numbers you may use the following `configure` options:
`--with-scalar-type=complex` and either `--with-clanguage=c++` or (the default)
`--with-clanguage=c`. In our experience they deliver very similar performance
(speed), but if you are concerned you should just try both and see if one is faster.

### How come when I run the same program on the same number of processes I get a "different" answer?

Inner products and norms in PETSc are computed using the `MPI_Allreduce()` command.
For
different runs the order in which values arrive at a given process (via MPI) can differ,
and thus the order in which some floating point arithmetic operations are performed will
be different. Since floating point arithmetic is not associative, the computed quantity
may be slightly different.

Over a run the many slight differences in the inner products and norms will affect all
the computed results. It is important to realize that none of the computed answers are
any less right or wrong (in fact the sequential computation is no more right than the
parallel ones). All answers are equal, but some are more equal than others.

The discussion above assumes that the exact same algorithm is being used on the
different numbers of processes. When the algorithm is different for different numbers of
processes (almost all preconditioner algorithms except Jacobi are different for
different numbers of processes) then one expects to see (and does) a greater difference
in results for different numbers of processes. In some cases (for example the block
Jacobi preconditioner) it may be that the algorithm works for some numbers of processes
and does not work for others.

### How come when I run the same linear solver on a different number of processes it takes a different number of iterations?

The convergence of many of the preconditioners in PETSc, including the default parallel
preconditioner block Jacobi, depends on the number of processes. The more processes, the
(slightly) slower the convergence. This is the nature of iterative solvers: more
parallelism means that more "old" information is used in the solution process, hence
slower convergence.

(doc_faq_gpuhowto)=

### Can PETSc use GPUs to speed up computations?

:::{seealso}
See the GPU development {ref}`roadmap <doc_gpu_roadmap>` for the latest information
regarding the state of PETSc GPU integration.

See the GPU install {ref}`documentation <doc_config_accel>` for up-to-date information on
installing PETSc to use GPUs.
:::

Quick summary of usage with CUDA:

- The `VecType` `VECSEQCUDA`, `VECMPICUDA`, or `VECCUDA` may be used with
  `VecSetType()`, or `-vec_type seqcuda`, `mpicuda`, or `cuda` when
  `VecSetFromOptions()` is used.
- The `MatType` `MATSEQAIJCUSPARSE`, `MATMPIAIJCUSPARSE`, or `MATAIJCUSPARSE`
  may be used with `MatSetType()`, or `-mat_type seqaijcusparse`, `mpiaijcusparse`, or
  `aijcusparse` when `MatSetFromOptions()` is used.
- If you are creating the vectors and matrices with a `DM`, you can use
  `-dm_vec_type cuda` and `-dm_mat_type aijcusparse`.

Quick summary of usage with OpenCL (provided by the ViennaCL library):

- The `VecType` `VECSEQVIENNACL`, `VECMPIVIENNACL`, or `VECVIENNACL` may be used
  with `VecSetType()`, or `-vec_type seqviennacl`, `mpiviennacl`, or `viennacl`
  when `VecSetFromOptions()` is used.
- The `MatType` `MATSEQAIJVIENNACL`, `MATMPIAIJVIENNACL`, or `MATAIJVIENNACL`
  may be used with `MatSetType()`, or `-mat_type seqaijviennacl`, `mpiaijviennacl`, or
  `aijviennacl` when `MatSetFromOptions()` is used.
- If you are creating the vectors and matrices with a `DM`, you can use
  `-dm_vec_type viennacl` and `-dm_mat_type aijviennacl`.

General hints:

- It is useful to develop your code with the default vectors and then use the command
  line options to run production jobs on the GPU, since debugging on GPUs is difficult.
- All of the Krylov methods except `KSPIBCGS` run on the GPU.
- Parts of most preconditioners run directly on the GPU. After setup, `PCGAMG` runs
  fully on GPUs, without any memory copies between the CPU and GPU.

Some GPU systems (for example many laptops) only run with single precision; thus, PETSc
must be built with the `configure` option `--with-precision=single`.
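
For instance, a `DM`-based code such as `src/snes/tutorials/ex19` could be moved onto
the GPU entirely from the command line (a sketch; it assumes a CUDA-enabled PETSc build,
and the `PCGAMG` choice is purely illustrative):

```console
$ ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -pc_type gamg -snes_monitor
```
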

(doc_faq_extendedprecision)=

### Can I run PETSc with extended precision?

Yes, with gcc and gfortran. `configure` PETSc using the
options `--with-precision=__float128` and `--download-f2cblaslapack`.

:::{admonition} Warning
:class: yellow

External packages are not guaranteed to work in this mode!
:::

### Why doesn't PETSc use QD to implement support for extended precision?

We tried really hard but could not. The problem is that the QD C++ classes, though they
try to emulate the built-in `double` data type, are not native types and cannot "just be
used" in a general piece of numerical source code. Rather, the code has to be rewritten
to live within the limitations of the QD classes. However, PETSc can be built to use
quad precision, as detailed {ref}`here <doc_faq_extendedprecision>`.

### How do I cite PETSc?

Use {any}`these citations <doc_index_citing_petsc>`.

______________________________________________________________________

## Installation

### How do I begin using PETSc if the software has already been completely built and installed by someone else?

Assuming that the PETSc libraries have been successfully built for a particular
architecture and level of optimization, a new user must merely:

1. Set `PETSC_DIR` to the full path of the PETSc home directory. This will be the
   location of the `configure` script, and is usually called "petsc" or some variation
   of that (for example, `/home/username/petsc`).
2. Set `PETSC_ARCH`, which indicates the configuration on which PETSc will be used.
   Note that `$PETSC_ARCH` is simply a name the installer used when installing the
   libraries. There will exist a directory within `$PETSC_DIR` that is named after its
   corresponding `$PETSC_ARCH`.
There may be several on a single system, for
   example "linux-c-debug" for the debug version compiled by a C compiler or
   "linux-c-opt" for the optimized version.

:::{admonition} Still Stuck?
See the {ref}`quick-start tutorial <tut_install>` for a step-by-step guide on
installing PETSc, in case you have missed a step.

See the users manual section on {ref}`getting started <sec_getting_started>`.
:::

### The PETSc distribution is SO large. How can I reduce my disk space usage?

1. The PETSc users manual is provided in PDF format at `$PETSC_DIR/manual.pdf`. You
   can delete this.
2. The PETSc test suite contains sample output for many of the examples. These are
   contained in the PETSc directories `$PETSC_DIR/src/*/tutorials/output` and
   `$PETSC_DIR/src/*/tests/output`. Once you have run the test examples, you may remove
   all of these directories to save some disk space. You can locate the largest with,
   e.g., `find . -name output -type d | xargs du -sh | sort -hr` on a Unix-based system.
3. The debugging versions of the libraries are larger than the optimized versions. In a
   pinch you can work with the optimized version, although we bid you good luck in
   finding bugs, as it is much easier with the debug version.

### I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI?

No, run `configure` with the option `--with-mpi=0`.

### Can I install PETSc to not use X windows (either under Unix or Microsoft Windows with GCC)?

Yes. Run `configure` with the additional flag `--with-x=0`.

### Why do you use MPI?

MPI is the message-passing standard. Because it is a standard, it will not frequently
change over time; thus, we do not have to change PETSc every time the provider of the
message-passing system decides to make an interface change.
MPI was carefully designed by experts from
industry, academia, and government labs to provide the highest quality performance and
capability.

For example, the careful design of communicators in MPI allows the easy nesting of
different libraries; no other message-passing system provides this support. All of the
major parallel computer vendors were involved in the design of MPI and have committed to
providing quality implementations.

In addition, since MPI is a standard, several different groups have already provided
complete free implementations. Thus, one does not have to rely on the technical skills
of one particular group to provide the message-passing libraries. Today, MPI is the only
practical, portable approach to writing efficient parallel numerical software.

(invalid_mpi_compilers)=

### What do I do if my MPI compiler wrappers are invalid?

Most MPI implementations provide compiler wrappers (such as `mpicc`) which give the
underlying compilers the include and link options necessary to use that version of MPI.
Configuration will fail if these wrappers are either absent or broken in the MPI pointed
to by `--with-mpi-dir`. You can rerun `configure` with the additional option
`--with-mpi-compilers=0`, which will try to auto-detect working compilers; however,
these compilers may be incompatible with the particular MPI build. If this fix does not
work, run with `--with-cc=[your_c_compiler]` where you know `your_c_compiler` works
with this particular MPI, and likewise for C++ (`--with-cxx=[your_cxx_compiler]`) and
Fortran (`--with-fc=[your_fortran_compiler]`).

(bit_indices)=

### When should/can I use the `configure` option `--with-64-bit-indices`?

By default the type that PETSc uses to index into arrays and keep the sizes of arrays is
a `PetscInt`, defined to be a 32-bit `int`. If your problem:

- Involves more than 2^31 - 1 unknowns (around 2 billion).
- Your matrix might contain more than 2^31 - 1 nonzeros on a single process.

Then you need to use this option. Otherwise you will get strange crashes.

This option can be used when you are using either 32-bit or 64-bit pointers. You do not
need to use this option if you are using 64-bit pointers unless the two conditions above
hold.

### What if I get an internal compiler error?

You can rebuild the offending file individually with a lower optimization level. **Then
make sure to complain to the compiler vendor and file a bug report**. For example, if the
compiler chokes on `src/mat/impls/baij/seq/baijsolvtrannat.c` you can run the following
from `$PETSC_DIR`:

```console
$ make -f gmakefile PCC_FLAGS="-O1" $PETSC_ARCH/obj/src/mat/impls/baij/seq/baijsolvtrannat.o
$ make all
```

### How do I enable Python bindings (petsc4py) with PETSc?

1. Install [Cython](https://cython.org/).
2. `configure` PETSc with the `--with-petsc4py=1` option.
3. Set `PYTHONPATH=$PETSC_DIR/$PETSC_ARCH/lib`.

(macos_gfortran)=

### What Fortran compiler do you recommend on macOS?

We recommend using [homebrew](https://brew.sh/) to install
[gfortran](https://gcc.gnu.org/wiki/GFortran); see {any}`doc_macos_install`.

### How can I find the URL locations of the packages you install using `--download-PACKAGE`?

```console
$ grep "self.download " $PETSC_DIR/config/BuildSystem/config/packages/*.py
```

### How to fix the problem: PETSc was configured with one MPICH (or Open MPI) `mpi.h` version but now appears to be compiling using a different MPICH (or Open MPI) `mpi.h` version

This generally happens for one of two reasons:

- You previously ran `configure` with the option `--download-mpich` (or
  `--download-openmpi`) but later ran `configure` to use a version of MPI already
  installed on the machine.
Solution:

  ```console
  $ rm -rf $PETSC_DIR/$PETSC_ARCH
  $ ./configure --your-args
  ```

(mpi_network_misconfigure)=

### What does it mean when `make check` hangs or errors on `PetscOptionsInsertFile()`?

For example:

```none
Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
See https://petsc.org/release/faq/
[0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #3 PetscInitialize() line 828 in /Users/barrysmith/Src/PETSc/src/sys/objects/pinit.c
```

or

```none
$ make check
Running check examples to verify correct installation
Using PETSC_DIR=/Users/barrysmith/Src/petsc and PETSC_ARCH=arch-fix-mpiexec-hang-2-ranks
C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
PROGRAM SEEMS TO BE HANGING HERE
```

This usually occurs when network settings are misconfigured (perhaps due to a VPN),
resulting in a failure or hang in the system call `gethostbyname()`.

- Verify you are using the correct `mpiexec` for the MPI you have linked PETSc with.

- If you have a VPN enabled on your machine, try turning it off and then running
  `make check` to verify that it is not the VPN playing poorly with MPI.

- If ``ping `hostname` `` (`/sbin/ping` on macOS) fails or hangs, do:

  ```none
  echo 127.0.0.1 `hostname` | sudo tee -a /etc/hosts
  ```

  and try `make check` again.

- Try completely disconnecting your machine from the network and see if `make check`
  then works.

- Try the PETSc `configure` option `--download-mpich-device=ch3:nemesis` with
  `--download-mpich`.

______________________________________________________________________

## Usage

### How can I redirect PETSc's `stdout` and `stderr` when programming with a GUI interface in Windows Developer Studio or to C++ streams?

To overload just the error messages, write your own `MyPrintError()` function that does
whatever you want (including popping up windows etc.) and use it like below.

```c
extern "C" {
  int PASCAL WinMain(HINSTANCE, HINSTANCE, LPSTR, int);
};

#include <petscsys.h>
#include <mpi.h>

const char help[] = "Set up from main";

int MyPrintError(const char error[], ...)
{
  printf("%s", error);
  return 0;
}

int main(int ac, char *av[])
{
  char      buf[256];
  HINSTANCE inst;

  inst            = (HINSTANCE)GetModuleHandle(NULL);
  PetscErrorPrintf = MyPrintError;

  buf[0] = 0;
  for (int i = 1; i < ac; ++i) {
    strcat(buf, av[i]);
    strcat(buf, " ");
  }

  PetscCall(PetscInitialize(&ac, &av, NULL, help));

  return WinMain(inst, NULL, buf, SW_SHOWNORMAL);
}
```

Place this file in the project and compile with these preprocessor definitions:

```
WIN32
_DEBUG
_CONSOLE
_MBCS
USE_PETSC_LOG
USE_PETSC_BOPT_g
USE_PETSC_STACK
_AFXDLL
```

And these link options:

```
/nologo
/subsystem:console
/incremental:yes
/debug
/machine:I386
/nodefaultlib:"libcmtd.lib"
/nodefaultlib:"libcd.lib"
/nodefaultlib:"mvcrt.lib"
/pdbtype:sept
```

:::{note}
The above is compiled and linked as if it were a console program. The linker will search
for a `main`, and from it the `WinMain` will start. This works with MFC templates and
derived classes too.

When writing a Windows console application you do not need to do anything; the `stdout`
and `stderr` are automatically output to the console window.
:::

To change where all PETSc `stdout` and `stderr` go (you can also reassign
`PetscVFPrintf()` to handle `stdout` and `stderr` any way you like), write the
following function:

```
PetscErrorCode mypetscvfprintf(FILE *fd, const char format[], va_list Argp)
{
  PetscFunctionBegin;
  if (fd != stdout && fd != stderr) { /* handle regular files */
    PetscCall(PetscVFPrintfDefault(fd, format, Argp));
  } else {
    char buff[1024]; /* Make sure to assign a large enough buffer */
    int  length;

    PetscCall(PetscVSNPrintf(buff, 1024, format, &length, Argp));

    /* now send buff to whatever stream or whatever you want */
  }
  PetscFunctionReturn(PETSC_SUCCESS);
}
```

Then assign `PetscVFPrintf = mypetscvfprintf;` before `PetscInitialize()` in your main
program.

### I want to use Hypre boomerAMG without GMRES but when I run `-pc_type hypre -pc_hypre_type boomeramg -ksp_type preonly` I don't get a very accurate answer!

You should run with `-ksp_type richardson` to have PETSc run several V or W
cycles. `-ksp_type preonly` causes boomerAMG to use only one V/W cycle. You can control
how many cycles are used in a single application of the boomerAMG preconditioner with
`-pc_hypre_boomeramg_max_iter <it>` (the default is 1). You can also control the
tolerance boomerAMG uses to decide whether to stop before `max_iter` with
`-pc_hypre_boomeramg_tol <tol>` (the default is 1.e-7). Run with `-ksp_view` to see
all the hypre options used and `-help | grep boomeramg` to see all the command line
options.

### How do I use PETSc for Domain Decomposition?

PETSc includes Additive Schwarz methods in the suite of preconditioners under the
umbrella of `PCASM`. These may be activated with the runtime option `-pc_type asm`.
Various
other options may be set, including the degree of overlap with
`-pc_asm_overlap <number>`, the type of restriction/extension with
`-pc_asm_type [basic,restrict,interpolate,none]`, and several others. You may see the
available ASM options by using `-pc_type asm -help`. See the procedural interfaces in
the manual pages, for example `PCASMSetType()`, and check the index of the users manual
for `PCASMCreateSubdomains()`.

PETSc also contains a domain decomposition inspired wirebasket or face based two-level
method where the coarse mesh to fine mesh interpolation is defined by solving specific
local subdomain problems. It currently only works for 3D scalar problems on structured
grids created with the PETSc `DMDA`. See the manual page for `PCEXOTIC` and
`src/ksp/ksp/tutorials/ex45.c` for an example.

PETSc also contains a balancing Neumann-Neumann type preconditioner; see the manual page
for `PCBDDC`. This requires matrices be constructed with `MatCreateIS()` via the finite
element method. See `src/ksp/ksp/tests/ex59.c` for an example.

### You have `MATAIJ` and `MATBAIJ` matrix formats, and `MATSBAIJ` for symmetric storage, how come no `MATSAIJ`?

Just for historical reasons; the `MATSBAIJ` format with block size one is just as
efficient as a `MATSAIJ` would be.

### Can I create BAIJ matrices with different size blocks for different block rows?

No. The `MATBAIJ` format only supports a single fixed block size for the entire
matrix. But the `MATAIJ` format automatically searches for matching rows and thus still
takes advantage of the natural blocks in your matrix to obtain good performance.

:::{note}
If you use `MATAIJ` you cannot use `MatSetValuesBlocked()`.
:::

### How do I access the values of a remote parallel PETSc Vec?

1. On each process create a local `Vec` large enough to hold all the values it wishes
   to access.
2.
   Create a `VecScatter` that scatters from the parallel `Vec` into the local `Vec`.
3. Use `VecGetArray()` to access the values in the local `Vec`.

For example, assuming we have distributed a vector `vecGlobal` of size $N$ to
$R$ ranks and each rank holds $N/R = m$ values (similarly assume that
$N$ is cleanly divisible by $R$). We want each rank $r$ to gather the
first $n$ (also assume $n \leq m$) values from its immediately superior neighbor
$r+1$ (the final rank will retrieve from rank 0).

```
Vec          vecLocal;
IS           isLocal, isGlobal;
VecScatter   ctx;
PetscScalar *arr;
PetscInt     N, firstGlobalIndex;
MPI_Comm     comm;
PetscMPIInt  r, R;

/* Create sequential local vector, big enough to hold local portion */
PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, &vecLocal));

/* Create IS to describe where we want to scatter to */
PetscCall(ISCreateStride(PETSC_COMM_SELF, n, 0, 1, &isLocal));

/* Compute the global indices */
PetscCall(VecGetSize(vecGlobal, &N));
PetscCall(PetscObjectGetComm((PetscObject)vecGlobal, &comm));
PetscCallMPI(MPI_Comm_rank(comm, &r));
PetscCallMPI(MPI_Comm_size(comm, &R));
firstGlobalIndex = r == R-1 ? 0 : (N/R)*(r+1);

/* Create IS that describes where we want to scatter from */
PetscCall(ISCreateStride(comm, n, firstGlobalIndex, 1, &isGlobal));

/* Create the VecScatter context */
PetscCall(VecScatterCreate(vecGlobal, isGlobal, vecLocal, isLocal, &ctx));

/* Gather the values */
PetscCall(VecScatterBegin(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(VecScatterEnd(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));

/* Retrieve and do work */
PetscCall(VecGetArray(vecLocal, &arr));
/* Work */
PetscCall(VecRestoreArray(vecLocal, &arr));

/* Don't forget to clean up */
PetscCall(ISDestroy(&isLocal));
PetscCall(ISDestroy(&isGlobal));
PetscCall(VecScatterDestroy(&ctx));
PetscCall(VecDestroy(&vecLocal));
```

(doc_faq_usage_alltoone)=

### How do I collect to a single processor all the values from a parallel PETSc Vec?

1. Create the `VecScatter` context that will do the communication:

   ```
   Vec        in_par, out_seq;
   VecScatter ctx;

   PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
   ```

2. Initiate the communication (this may be repeated if you wish):

   ```
   PetscCall(VecScatterBegin(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
   PetscCall(VecScatterEnd(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
   /* May destroy context now if no additional scatters are needed, otherwise reuse it */
   PetscCall(VecScatterDestroy(&ctx));
   ```

Note that this simply concatenates in the parallel ordering of the vector (computed by
the `MPI_Comm_rank` of the owning process).
If you are using a `Vec` from
`DMCreateGlobalVector()` you likely want to first call `DMDAGlobalToNaturalBegin()`
followed by `DMDAGlobalToNaturalEnd()` to scatter the original `Vec` into the natural
ordering in a new global `Vec` before calling `VecScatterBegin()`/`VecScatterEnd()`
to scatter the natural `Vec` onto all processes.

### How do I collect all the values from a parallel PETSc Vec on the 0th rank?

See the FAQ entry on collecting to {ref}`an arbitrary processor <doc_faq_usage_alltoone>`,
but replace

```
PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
```

with

```
PetscCall(VecScatterCreateToZero(in_par, &ctx, &out_seq));
```

:::{note}
The same ordering considerations as discussed in the aforementioned entry also apply
here.
:::

### How can I read in or write out a sparse matrix in Matrix Market, Harwell-Boeing, Slapc, or other ASCII format?

If you can read or write your matrix using Python or MATLAB/Octave, `PetscBinaryIO`
modules are provided at `$PETSC_DIR/lib/petsc/bin` for each language that can assist
with reading and writing. If you just want to convert `MatrixMarket`, you can use:

```console
$ python -m $PETSC_DIR/lib/petsc/bin/PetscBinaryIO convert matrix.mtx
```

to produce `matrix.petsc`.

You can also call the script directly or import it from your Python code. There is also
a `PETScBinaryIO.jl` Julia package.

For other formats, either adapt one of the above libraries or see the examples in
`$PETSC_DIR/src/mat/tests`, specifically `ex72.c` or `ex78.c`. You will likely need
to modify the code slightly to match your required ASCII format.

:::{note}
Never read or write an ASCII matrix file in parallel.
Instead read in sequentially with a standalone code based on `ex72.c` or `ex78.c`,
then save the matrix with the binary viewer `PetscViewerBinaryOpen()` and load the
matrix in parallel in your "real" PETSc program with `MatLoad()`.

For writing, save with the binary viewer and then load with the sequential code to store
it as ASCII.
:::

### Does TSSetFromOptions(), SNESSetFromOptions(), or KSPSetFromOptions() reset all the parameters I previously set, or why do they sometimes not seem to work?

If `XXSetFromOptions()` is used (with `-xxx_type aaaa`) to change the type of the
object then all parameters associated with the previous type are removed. Otherwise it
does not reset parameters.

`TS/SNES/KSPSetXXX()` commands that set properties for a particular type of object (such
as `KSPGMRESSetRestart()`) ONLY work if the object is ALREADY of that type. For example,
with

```
KSP ksp;

PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
PetscCall(KSPGMRESSetRestart(ksp, 10));
```

the restart will be ignored since the type has not yet been set to `KSPGMRES`. To have
those values take effect you should do one of the following:

- Allow setting the type from the command line; if it is not on the command line then the
  default type is automatically set.

```
/* Create generic object */
XXXCreate(..,&obj);
/* Must set all settings here, or default */
XXXSetFromOptions(obj);
```

- Hardwire the type in the code, but allow the user to override it via a subsequent
  `XXXSetFromOptions()` call. This essentially allows the user to customize the
  "default" type of the object.
```
/* Create generic object */
XXXCreate(..,&obj);
/* Set type directly */
XXXSetYYYYY(obj,...);
/* Can always change to different type */
XXXSetFromOptions(obj);
```

### How do I compile and link my own PETSc application codes and can I use my own `makefile` or rules for compiling code, rather than PETSc's?

See the {ref}`section <sec_writing_application_codes>` of the users manual on writing
application codes with PETSc.

### Can I use CMake to build my own project that depends on PETSc?

See the {ref}`section <sec_writing_application_codes>` of the users manual on writing
application codes with PETSc.

### How can I put carriage returns in `PetscPrintf()` statements from Fortran?

You can use the same notation as in C, just put a `\n` in the string. Note that no other C
format instruction is supported.

Or you can use the Fortran concatenation `//` and `char(10)`; for example `'some
string'//char(10)//'another string'` puts the second string on the next line.

### How can I implement callbacks using C++ class methods?

Declare the class method static. Static methods do not have a `this` pointer, but the
`void*` context parameter will usually be cast to a pointer to the class where it can
serve the same function.

:::{admonition} Remember
All PETSc callbacks return `PetscErrorCode`.
:::

### Everyone knows that when you code Newton's Method you should compute the function and its Jacobian at the same time. How can one do this in PETSc?

The update in Newton's method is computed as

$$
u^{n+1} = u^n - \lambda * \left[J(u^n)\right]^{-1} * F(u^n)
$$

The reason PETSc doesn't default to computing both the function and Jacobian at the same
time is:

- In order to do the line search $F \left(u^n - \lambda * \text{step} \right)$ may
  need to be computed for several $\lambda$.
The Jacobian is not needed for each of
  those, and one does not know in advance which will be the final $\lambda$ until
  after the function value is computed, so many extra Jacobians may be computed.
- In the final step, if $\| F(u^{n+1}) \|$ satisfies the convergence criteria, then a
  Jacobian need not be computed.

You are free to have your `FormFunction()` compute as much of the Jacobian at that point
as you like, keep the information in the user context (the final argument to
`FormFunction()` and `FormJacobian()`), and then retrieve the information in your
`FormJacobian()` function.

### Computing the Jacobian or preconditioner is time consuming. Is there any way to compute it less often?

PETSc has a variety of ways of lagging the computation of the Jacobian or the
preconditioner. They are documented in the manual page for `SNESComputeJacobian()`
and in the {ref}`users manual <ch_snes>`:

- `-snes_lag_jacobian` (`SNESSetLagJacobian()`): how often the Jacobian is rebuilt (use
  -1 to never rebuild, use -2 to rebuild the next time requested and then never again).
- `-snes_lag_jacobian_persists` (`SNESSetLagJacobianPersists()`): forces lagging of the
  Jacobian through multiple `SNES` solves, the same as passing -2 to
  `-snes_lag_jacobian`. By default, each new `SNES` solve triggers a recomputation of
  the Jacobian.
- `-snes_lag_preconditioner` (`SNESSetLagPreconditioner()`): how often the
  preconditioner is rebuilt. Note: if you are lagging the Jacobian, the system will know
  that the matrix has not changed and will not recompute the (same) preconditioner.
- `-snes_lag_preconditioner_persists` (`SNESSetLagPreconditionerPersists()`): the
  preconditioner lags through multiple `SNES` solves.

:::{note}
These are often (but need not be) used in combination with
`-snes_mf_operator`, which applies the fresh Jacobian matrix-free for every
matrix-vector product. Otherwise the out-of-date matrix-vector product, computed with
the lagged Jacobian, will be used.
:::

By using `KSPMonitorSet()` and/or `SNESMonitorSet()` one can provide code that monitors the
convergence rate and automatically triggers an update of the Jacobian or preconditioner
based on decreasing convergence of the iterative method. For example, if the number of `SNES`
iterations doubles one might trigger a new computation of the Jacobian. Experimentation is
the only general purpose way to determine which approach is best for your problem.

:::{important}
It is also vital to experiment on your true problem at the scale you will be solving
the problem since the performance benefits depend on the exact problem and the problem
size!
:::

### How can I use Newton's Method Jacobian free? Can I difference a different function than provided with `SNESSetFunction()`?

The simplest way is with the option `-snes_mf`, which will use finite differencing of the
function provided with `SNESSetFunction()` to approximate the action of the Jacobian.

:::{important}
Since no matrix representation of the Jacobian is provided, the `-pc_type` used with
this option must be `-pc_type none`. You may provide a custom preconditioner with
`SNESGetKSP()`, `KSPGetPC()`, and `PCSetType()` and use `PCSHELL`.
:::

The option `-snes_mf_operator` will apply the Jacobian matrix-free (in the
Krylov solvers) but will use whatever matrix you provided with `SNESSetJacobian()`
(assuming you set one) to compute the preconditioner.
To write the code (rather than use the options above) use `MatCreateSNESMF()` and pass
the resulting matrix object to `SNESSetJacobian()`.

For purely matrix-free (like `-snes_mf`) pass the matrix object for both matrix
arguments and pass the function `MatMFFDComputeJacobian()`.

To provide your own approximate Jacobian matrix to compute the preconditioner (like
`-snes_mf_operator`), pass this other matrix as the second matrix argument to
`SNESSetJacobian()`. Make sure your provided `computejacobian()` function calls
`MatAssemblyBegin()` and `MatAssemblyEnd()` separately on **BOTH** matrix arguments
to this function. See `src/snes/tests/ex7.c`.

To difference a different function than that passed to `SNESSetFunction()` to compute the
matrix-free Jacobian multiply, call `MatMFFDSetFunction()` to set that other function. See
`src/snes/tests/ex7.c`.

(doc_faq_usage_condnum)=

### How can I determine the condition number of a matrix?

For small matrices, the condition number can be reliably computed using

```text
-pc_type svd -pc_svd_monitor
```

For larger matrices, you can run with

```text
-pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000
```

to get approximations to the condition number of the operator. This will generally be
accurate for the largest singular values, but may overestimate the smallest singular value
unless the method has converged. Make sure to avoid restarts. To estimate the condition
number of the preconditioned operator, use `-pc_type somepc` in the last command.

You can use [SLEPc](https://slepc.upv.es) for highly scalable, efficient, and quality eigenvalue computations.

### How can I compute the inverse of a matrix in PETSc?

:::{admonition} Are you sure?
:class: yellow

It is very expensive to compute the inverse of a matrix and very rarely needed in
practice.
We highly recommend avoiding algorithms that need it.
:::

The inverse of a matrix (dense or sparse) is essentially always dense, so begin by
creating a dense matrix B filled with the identity matrix (ones along the diagonal), and
also create a dense matrix X of the same size that will hold the solution. Then factor the
matrix you wish to invert with `MatLUFactor()` or `MatCholeskyFactor()`; call the
result A. Then call `MatMatSolve(A,B,X)` to compute the inverse into X. See also the section
on {any}`Schur complements <how_can_i_compute_the_schur_complement>`.

(how_can_i_compute_the_schur_complement)=

### How can I compute the Schur complement in PETSc?

:::{admonition} Are you sure?
:class: yellow

It is very expensive to compute the Schur complement of a matrix and very rarely needed
in practice. We highly recommend avoiding algorithms that need it.
:::

The Schur complement of the matrix $M \in \mathbb{R}^{\left(p+q \right) \times
\left(p + q \right)}$

$$
M = \begin{bmatrix}
A & B \\
C & D
\end{bmatrix}
$$

where

$$
A \in \mathbb{R}^{p \times p}, \quad B \in \mathbb{R}^{p \times q}, \quad C \in \mathbb{R}^{q \times p}, \quad D \in \mathbb{R}^{q \times q}
$$

is given by

$$
S_D := A - BD^{-1}C \\
S_A := D - CA^{-1}B
$$

Like the inverse, the Schur complement of a matrix (dense or sparse) is essentially always
dense, so assuming you wish to calculate $S_A = D - C \underbrace{
\overbrace{(A^{-1})}^{U} B}_{V}$ begin by:

1. Form a dense matrix $B$.
2. Create another dense matrix $V$ of the same size.
3. Then either factor the matrix $A$ directly with `MatLUFactor()` or
   `MatCholeskyFactor()`, or use `MatGetFactor()` followed by
   `MatLUFactorSymbolic()` followed by `MatLUFactorNumeric()` if you wish to use an
   external solver package like SuperLU_Dist.
Call the result $U$.
4. Then call `MatMatSolve(U,B,V)`.
5. Then call `MatMatMult(C,V,MAT_INITIAL_MATRIX,1.0,&S)`.
6. Now call `MatAXPY(S,-1.0,D,SUBSET_NONZERO_PATTERN)`.
7. Followed by `MatScale(S,-1.0)`.

For computing Schur complements like this it does not make sense to use the `KSP`
iterative solvers, since for these many moderate-sized solves a direct
factorization is much faster than an iterative solver. As you can see, this requires a great
deal of work space and computation, so it is best avoided.

However, it is not necessary to assemble the Schur complement $S$ in order to solve
systems with it. Use `MatCreateSchurComplement(A,A_pre,B,C,D,&S)` to create a
matrix that applies the action of $S$ (using `A_pre` to solve with `A`), but
does not assemble.

Alternatively, if you already have a block matrix `M = [A, B; C, D]` (in some
ordering), then you can create index sets (`IS`) `isa` and `isb` to address each
block, then use `MatGetSchurComplement()` to create the Schur complement and/or an
approximation suitable for preconditioning.

Since $S$ is generally dense, standard preconditioning methods cannot typically be
applied directly to Schur complements. There are many approaches to preconditioning Schur
complements, including using the `SIMPLE` approximation

$$
D - C \text{diag}(A)^{-1} B
$$

to create a sparse matrix that approximates the Schur complement (this is returned by
default for the optional "preconditioning" matrix in `MatGetSchurComplement()`).

An alternative is to interpret the matrices as differential operators and apply
approximate commutator arguments to find a spectrally equivalent operation that can be
applied efficiently (see the "PCD" preconditioners {cite}`elman_silvester_wathen_2014`).
A variant of this is the least squares commutator, which is closely related to the
Moore-Penrose pseudoinverse, and is available in `PCLSC`, which operates on matrices of
type `MATSCHURCOMPLEMENT`.

### Do you have examples of doing unstructured grid Finite Element Method (FEM) with PETSc?

There are at least three ways to write finite element codes using PETSc:

1. Use `DMPLEX`, which is a high level approach to manage your mesh and
   discretization. See the {ref}`tutorials sections <tut_stokes>` for further information,
   or see `src/snes/tutorials/ex62.c`.
2. Use packages such as [deal.ii](https://www.dealii.org), [libMesh](https://libmesh.github.io), or
   [Firedrake](https://www.firedrakeproject.org), which use PETSc for the solvers.
3. Manage the grid data structure yourself and use PETSc `PetscSF`, `IS`, and `VecScatter` to
   perform the required ghost point communication. See
   `src/snes/tutorials/ex10d/ex10.c`.

### DMDA decomposes the domain differently than the MPI_Cart_create() command. How can one use them together?

The `MPI_Cart_create()` first divides the mesh along the z direction, then the y, then
the x. `DMDA` divides along the x, then y, then z. Thus, for example, rank 1 of the
processes will be in a different part of the mesh for the two schemes. To resolve this you
can create a new MPI communicator that you pass to `DMDACreate()` that renumbers the
process ranks so that each physical process shares the same part of the mesh with both the
`DMDA` and the `MPI_Cart_create()`. The code to determine the new numbering was
The code to determine the new numbering was 1090provided by Rolf Kuiper: 1091 1092``` 1093// the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively 1094// (no parallelization in direction 'dir' means dir_procs = 1) 1095 1096MPI_Comm NewComm; 1097int x, y, z; 1098PetscMPIInt MPI_Rank, NewRank; 1099 1100// get rank from MPI ordering: 1101PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &MPI_Rank)); 1102 1103// calculate coordinates of cpus in MPI ordering: 1104x = MPI_rank / (z_procs*y_procs); 1105y = (MPI_rank % (z_procs*y_procs)) / z_procs; 1106z = (MPI_rank % (z_procs*y_procs)) % z_procs; 1107 1108// set new rank according to PETSc ordering: 1109NewRank = z*y_procs*x_procs + y*x_procs + x; 1110 1111// create communicator with new ranks according to PETSc ordering 1112PetscCallMPI(MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm)); 1113 1114// override the default communicator (was MPI_COMM_WORLD as default) 1115PETSC_COMM_WORLD = NewComm; 1116``` 1117 1118### When solving a system with Dirichlet boundary conditions I can use MatZeroRows() to eliminate the Dirichlet rows but this results in a non-Symmetric system. How can I apply Dirichlet boundary conditions but keep the matrix symmetric? 1119 1120- For nonsymmetric systems put the appropriate boundary solutions in the x vector and use 1121 `MatZeroRows()` followed by `KSPSetOperators()`. 1122- For symmetric problems use `MatZeroRowsColumns()`. 1123- If you have many Dirichlet locations you can use `MatZeroRows()` (**not** 1124 `MatZeroRowsColumns()`) and `-ksp_type preonly -pc_type redistribute` (see 1125 `PCREDISTRIBUTE`) and PETSc will repartition the parallel matrix for load 1126 balancing. In this case the new matrix solved remains symmetric even though 1127 `MatZeroRows()` is used. 
An alternative approach is, when assembling the matrix (generating values and passing
them to the matrix), never to include locations for the Dirichlet grid points in the
vector and matrix, instead taking them into account as you put the other values into the
load.

### How can I get PETSc vectors and matrices to MATLAB or vice versa?

There are numerous ways to work with PETSc and MATLAB. All but the first approach
require PETSc to be configured with `--with-matlab`.

1. To save PETSc `Mat` and `Vec` to files that can be read from MATLAB use the
   `PetscViewerBinaryOpen()` viewer and `VecView()` or `MatView()` to save objects
   for MATLAB and `VecLoad()` and `MatLoad()` to get the objects that MATLAB has
   saved. See `share/petsc/matlab/PetscBinaryRead.m` and
   `share/petsc/matlab/PetscBinaryWrite.m` for loading and saving the objects in MATLAB.
2. Using the [MATLAB Engine](https://www.mathworks.com/help/matlab/calling-matlab-engine-from-c-programs-1.html)
   allows PETSc to automatically call MATLAB to perform some specific computations. This
   does not allow MATLAB to be used interactively by the user. See
   `PetscMatlabEngine`.
3. You can open a socket connection between MATLAB and PETSc to allow sending objects back
   and forth between an interactive MATLAB session and a running PETSc program. See
   `PetscViewerSocketOpen()` for access from the PETSc side and
   `share/petsc/matlab/PetscBinaryRead.m` for access from the MATLAB side.
4. You can save PETSc `Vec` (**not** `Mat`) with the `PetscViewerMatlabOpen()`
   viewer, which saves `.mat` files that can then be loaded into MATLAB using the
   `load()` command.

### How do I get started with Cython so that I can extend petsc4py?

1. Learn how to [build a Cython module](http://docs.cython.org/src/quickstart/build.html).
2.
Go through the simple [example](https://stackoverflow.com/questions/3046305/simple-wrapping-of-c-code-with-cython). Note
   also the next comment that shows how to create numpy arrays in Cython and pass them
   back.
3. Check out [this page](http://docs.cython.org/src/tutorial/numpy.html) which tells
   you how to get fast indexing.
4. Have a look at the petsc4py [array source](http://code.google.com/p/petsc4py/source/browse/src/PETSc/arraynpy.pxi).

### How do I compute a custom norm for KSP to use as a convergence test or for monitoring?

You need to call `KSPBuildResidual()` on your `KSP` object and then compute the
appropriate norm on the resulting residual. Note that depending on the
`KSPSetNormType()` of the method you may not return the same norm as provided by the
method. See also `KSPSetPCSide()`.

### If I have a sequential program can I use a PETSc parallel solver?

:::{important}
Do not expect to get great speedups! Much of the speedup gained by using parallel
solvers comes from building the underlying matrices and vectors in parallel to begin
with. You should see some reduction in the time for the linear solvers.
:::

Yes, you must set up PETSc with MPI (even though you will not use MPI) with at least the
following options:

```console
$ ./configure --download-superlu_dist --download-parmetis --download-metis --with-openmp
```

Your compiler must support OpenMP. To have the linear solver run in parallel run your
program with

```console
$ OMP_NUM_THREADS=n ./myprog -pc_type lu -pc_factor_mat_solver_type superlu_dist
```

where `n` is the number of threads and should be less than or equal to the number of cores
available.

:::{note}
If your code is MPI parallel you can also use these same options to have SuperLU_dist
utilize multiple threads per MPI process for the direct solver.
Make sure that the
`$OMP_NUM_THREADS` you use per MPI process is less than or equal to the number of
cores available for each MPI process. For example, if your compute nodes have 6 cores
and you use 2 MPI processes per node, then set `$OMP_NUM_THREADS` to 2 or 3.
:::

Another approach that allows using a PETSc parallel solver is to use `PCMPI`.

### TS or SNES produces infeasible (out of domain) solutions or states. How can I prevent this?

For `TS` call `TSSetFunctionDomainError()`. For both `TS` and `SNES` call
`SNESSetFunctionDomainError()` when the solver passes an infeasible (out of domain)
solution or state to your routines.

If it occurs for DAEs, it is important to ensure the algebraic constraints are well
satisfied, which can prevent "breakdown" later. Thus, one can try using a tight tolerance
for `SNES`, using a direct linear solver (`PCType` of `PCLU`) when possible, and reducing the timestep (or
tightening `TS` tolerances for adaptive time stepping).

### Can PETSc work with Hermitian matrices?

PETSc's support of Hermitian matrices is limited. Many operations and solvers work
for symmetric (`MATSBAIJ`) matrices and operations on transpose matrices, but there is
little direct support for Hermitian matrices and Hermitian transpose (complex conjugate
transpose) operations. There is `KSPSolveTranspose()` for solving the transpose of a
linear system but no `KSPSolveHermitian()`.

For creating known Hermitian matrices:

- `MatCreateNormalHermitian()`
- `MatCreateHermitianTranspose()`

For determining or setting Hermitian status on existing matrices:

- `MatIsHermitian()`
- `MatIsHermitianKnown()`
- `MatIsStructurallySymmetric()`
- `MatIsSymmetricKnown()`
- `MatIsSymmetric()`
- `MatSetOption()` (use with `MAT_SYMMETRIC` or `MAT_HERMITIAN` to assert to PETSc
  that either is the case).
For performing matrix operations on known Hermitian matrices (note that regular `Mat`
functions such as `MatMult()` will of course also work on Hermitian matrices):

- `MatMultHermitianTranspose()`
- `MatMultHermitianTransposeAdd()` (very limited support)

### How can I assemble a bunch of similar matrices?

You can first add the values common to all the matrices, then use `MatStoreValues()` to
stash the common values. On each iteration call `MatRetrieveValues()`, then set the
unique values in the matrix and assemble.

### Can one resize or change the size of PETSc matrices or vectors?

No, once the vector or matrix sizes have been set and the matrices or vectors are fully
usable, one cannot change the size of the matrices or vectors or the number of processors they
live on. One may create new vectors and copy, for example using `VecScatterCreate()`,
the old values from the previous vector.

### How can one compute the nullspace of a sparse matrix with MUMPS?
Assuming you have an existing matrix $A$ whose nullspace $V$ you want to find:

```
Mat F, work, V;
PetscInt N, rows;

/* Determine factorability */
PetscCall(MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F));
PetscCall(MatGetLocalSize(A, &rows, NULL));

/* Set MUMPS options, see MUMPS documentation for more information */
PetscCall(MatMumpsSetIcntl(F, 24, 1));
PetscCall(MatMumpsSetIcntl(F, 25, 1));

/* Perform factorization */
PetscCall(MatLUFactorSymbolic(F, A, NULL, NULL, NULL));
PetscCall(MatLUFactorNumeric(F, A, NULL));

/* This is the dimension of the null space */
PetscCall(MatMumpsGetInfog(F, 28, &N));

/* This will contain the null space in the columns */
PetscCall(MatCreateDense(comm, rows, PETSC_DECIDE, PETSC_DETERMINE, N, NULL, &V));

PetscCall(MatDuplicate(V, MAT_DO_NOT_COPY_VALUES, &work));
PetscCall(MatMatSolve(F, work, V));
```

______________________________________________________________________

## Execution

### PETSc executables are SO big and take SO long to link

:::{note}
See {ref}`shared libraries section <doc_faq_sharedlibs>` for more information.
:::

We find this annoying as well. On most machines PETSc can use shared libraries, so
executables should be much smaller; run `configure` with the additional option
`--with-shared-libraries` (this is the default). Also, if you have room, compiling and
linking PETSc on your machine's `/tmp` disk or similar local disk, rather than over the
network, will be much faster.

### How does PETSc's `-help` option work? Why is it different for different programs?

There are 2 ways in which one interacts with the options database:

- `PetscOptionsGetXXX()` where `XXX` is some type or data structure (for example
  `PetscOptionsGetBool()` or `PetscOptionsGetScalarArray()`).
This is a classic
  "getter" function, which queries the command line options for a matching option name,
  and returns the specified value.
- `PetscOptionsXXX()` where `XXX` is some type or data structure (for example
  `PetscOptionsBool()` or `PetscOptionsScalarArray()`). This is a so-called "provider"
  function. It first records the option name in an internal list of previously encountered
  options before calling `PetscOptionsGetXXX()` to query the status of said option.

While users generally use the first (getter) variant, developers will *always* use the second
(provider) variant. Thus, as the program runs, it will build up a list of
encountered option names which are then printed **in the order of their appearance on the
root rank**. Different programs may take different paths through PETSc source code, so
they will encounter different providers, and therefore have different `-help` output.

### PETSc has so many options for my program that it is hard to keep them straight

Running the PETSc program with the option `-help` will print out many of the options. To
print the options that have been specified within a program, employ `-options_left` to
print any options that the user specified but were not actually used by the program and
all options used; this is helpful for detecting typos. The PETSc website has a search option,
in the upper right-hand corner, that quickly finds answers to most PETSc questions.

### PETSc automatically handles many of the details in parallel PDE solvers. How can I understand what is really happening within my program?

You can use the option `-info` to get more details about the solution process. The
option `-log_view` provides details about the distribution of time spent in the various
phases of the solution process.
You can run with `-ts_view` or `-snes_view` or
`-ksp_view` to see what solver options are being used. Run with `-ts_monitor`,
`-snes_monitor`, or `-ksp_monitor` to watch convergence of the
methods. `-snes_converged_reason` and `-ksp_converged_reason` will indicate whether and why
the solvers have converged.

### Assembling large sparse matrices takes a long time. What can I do to make this process faster? Or MatSetValues() is so slow; what can I do to speed it up?

You probably need to do preallocation, as explained in {any}`sec_matsparse`.
See also the {ref}`performance chapter <ch_performance>` of the users manual.

For GPUs (and even CPUs) you can use `MatSetPreallocationCOO()` and `MatSetValuesCOO()` for more rapid assembly.

### How can I generate performance summaries with PETSc?

Use this option at runtime:

- `-log_view`: outputs a comprehensive timing, memory consumption, and communications
  digest for your program. See the {ref}`profiling chapter <ch_profiling>` of the users
  manual for information on interpreting the summary data.

### How do I know the amount of time spent on each level of the multigrid solver/preconditioner?

Run with `-log_view` and `-pc_mg_log`.

### Where do I get the input matrices for the examples?

Some examples use `$DATAFILESPATH/matrices/medium` and other files. These test matrices
in PETSc binary format can be found in the [datafiles repository](https://gitlab.com/petsc/datafiles).

### When I dump some matrices and vectors to binary, I seem to be generating some empty files with `.info` extensions. What's the deal with these?

PETSc binary viewers put some additional information into `.info` files like matrix
block size. It is harmless but if you *really* don't like it you can use
`-viewer_binary_skip_info` or `PetscViewerBinarySkipInfo()`.
:::{note}
You need to call `PetscViewerBinarySkipInfo()` before
`PetscViewerFileSetName()`. In other words you **cannot** use
`PetscViewerBinaryOpen()` directly.
:::

### Why is my parallel solver slower than my sequential solver, or why do I have poor speed-up?

This can happen for many reasons:

1. Make sure it is truly the time in `KSPSolve()` that is slower (by running the code
   with `-log_view`). Often the slower time is in generating the matrix or some other
   operation.
2. There must be enough work for each process to outweigh the communication time. We
   recommend an absolute minimum of about 10,000 unknowns per process; better is 20,000 or
   more. This is even more true when using multiple GPUs, where you need to have millions
   of unknowns per GPU.
3. Make sure the {ref}`communication speed of the parallel computer
   <doc_faq_general_parallel>` is good enough for parallel solvers.
4. Check the number of solver iterations with the parallel solver against the sequential
   solver. Most preconditioners require more iterations when used on more processes; this
   is particularly true for block Jacobi (the default parallel preconditioner). You can
   try `-pc_type asm` (`PCASM`), whose iterations scale a bit better with more
   processes. You may also consider multigrid preconditioners like `PCMG` or BoomerAMG
   in `PCHYPRE`.

(doc_faq_pipelined)=

### What steps are necessary to make the pipelined solvers execute efficiently?

Pipelined solvers like `KSPPGMRES`, `KSPPIPECG`, `KSPPIPECR`, and `KSPGROPPCG` may
require special MPI configuration to effectively overlap reductions with computation. In
general, this requires an MPI-3 implementation, an implementation that supports multiple
threads, and use of a "progress thread". Consult the documentation from your vendor or
computing facility for more details.
- Cray MPT 5.6 and later implements MPI-3; enable thread support by setting
  `$MPICH_MAX_THREAD_SAFETY` to `multiple`, plus either `$MPICH_ASYNC_PROGRESS` or
  `$MPICH_NEMESIS_ASYNC_PROGRESS` for a progress thread. E.g.

  ```console
  $ export MPICH_MAX_THREAD_SAFETY=multiple
  $ export MPICH_ASYNC_PROGRESS=1
  $ export MPICH_NEMESIS_ASYNC_PROGRESS=1
  ```

- MPICH version 3.0 and later implements the MPI-3 standard and the default
  configuration supports use of threads. Use of a progress thread is configured by
  setting the environment variable `$MPICH_ASYNC_PROGRESS`. E.g.

  ```console
  $ export MPICH_ASYNC_PROGRESS=1
  ```

### When using PETSc in single precision mode (`--with-precision=single` when running `configure`) are the operations done in single or double precision?

PETSc does **NOT** do any explicit conversion of single precision to double before
performing computations; what happens depends on the hardware and compiler. For
example, the compiler could choose to put the single precision numbers into the usual
double precision registers and then use the usual double precision floating point unit. Or
it could use SSE2 instructions that work directly on the single precision numbers. It is a
bit of a mystery what decisions get made sometimes. There may be compiler flags in some
circumstances that can affect this.

### Why is Newton's method (SNES) not converging, or converges slowly?

Newton's method may not converge for many reasons; here are some of the most common:

- The Jacobian is wrong (or correct in sequential but not in parallel).
- The linear system is {ref}`not solved <doc_faq_execution_kspconv>` or is not solved
  accurately enough.
- The Jacobian system has a singularity that the linear solver is not handling.
- There is a bug in the function evaluation routine.
- The function is not continuous or does not have continuous first derivatives (e.g.
  phase change or TVD limiters).
- The equations may not have a solution (e.g. a limit cycle instead of a steady state), or
  there may be a "hill" between the initial guess and the steady state (e.g. reactants
  must ignite and burn before reaching a steady state, but the steady-state residual will
  be larger during combustion).

Here are some ways to help debug lack of convergence of Newton:

- Run on one processor to see if the problem occurs only in parallel.

- Run with `-info` to get more detailed information on the solution process.

- Run with the options

  ```text
  -snes_monitor -ksp_monitor_true_residual -snes_converged_reason -ksp_converged_reason
  ```

  - If the linear solve does not converge, check if the Jacobian is correct, then see
    {ref}`this question <doc_faq_execution_kspconv>`.
  - If the preconditioned residual converges, but the true residual does not, the
    preconditioner may be singular.
  - If the linear solve converges well, but the line search fails, the Jacobian may be
    incorrect.

- Run with `-pc_type lu` or `-pc_type svd` to see if the problem is a poor linear
  solver.

- Run with `-mat_view` or `-mat_view draw` to see if the Jacobian looks reasonable.

- Run with `-snes_test_jacobian -snes_test_jacobian_view` to see if the Jacobian you are
  using is wrong. Compare the output when you add `-mat_fd_type ds` to see if the result
  is sensitive to the choice of differencing parameter.

- Run with `-snes_mf_operator -pc_type lu` to see if the Jacobian you are using is
  wrong. If the problem is too large for a direct solve, try

  ```text
  -snes_mf_operator -pc_type ksp -ksp_ksp_rtol 1e-12
  ```

  Compare the output when you add `-mat_mffd_type ds` to see if the result is sensitive
  to the choice of differencing parameter.

- Run with `-snes_linesearch_monitor` to see if the line search is failing (this is
  usually a sign of a bad Jacobian). Use `-info` in PETSc 3.1 and older versions,
  `-snes_ls_monitor` in PETSc 3.2, and `-snes_linesearch_monitor` in PETSc 3.3 and
  later.

Here are some ways to help the Newton process if everything above checks out:

- Run with grid sequencing (`-snes_grid_sequence` is all you need if you are working
  with a `DM`) to generate a better initial guess on your finer mesh.

- Run with quad precision, i.e.

  ```console
  $ ./configure --with-precision=__float128 --download-f2cblaslapack
  ```

  :::{note}
  Quad precision requires PETSc 3.2 or later and recent versions of the GNU compilers.
  :::

- Change the units (nondimensionalization), boundary condition scaling, or formulation so
  that the Jacobian is better conditioned. See the [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).

- Mollify features in the function that do not have continuous first derivatives (this
  often occurs when there are "if" statements in the residual evaluation, e.g. phase
  change or TVD limiters). Use a variational inequality solver (`SNESVINEWTONRSLS`) if
  the discontinuities are of fundamental importance.

- Try a trust region method (`-snes_type newtontr`; you may have to adjust the parameters).

- Run with some continuation parameter from a point where you know the solution; see
  `TSPSEUDO` for steady states.

- There are homotopy solver packages like PHCpack that can find all possible solutions
  (and tell you that they have found them all), but those are not scalable and cannot
  solve anything but small problems.

(doc_faq_execution_kspconv)=

### Why is the linear solver (KSP) not converging, or converging slowly?

:::{tip}
Always run with `-ksp_converged_reason -ksp_monitor_true_residual` when trying to
learn why a method is not converging!
:::

Common reasons for KSP not converging are:

- A symmetric method is being used for a non-symmetric problem.

- The equations are singular by accident (e.g. you forgot to impose boundary
  conditions). Check this for a small problem using `-pc_type svd -pc_svd_monitor`.

- The equations are intentionally singular (e.g. a constant null space), but the Krylov
  method was not informed; see `MatSetNullSpace()`. Always inform the Krylov solver of
  any null space of the operator.

- The equations are intentionally singular and `MatSetNullSpace()` was used, but the
  right-hand side is not consistent. You may have to call `MatNullSpaceRemove()` on the
  right-hand side before calling `KSPSolve()`. See `MatSetTransposeNullSpace()`.

- The equations are indefinite, so that standard preconditioners don't work. Usually you
  will know this from the physics, but you can check with

  ```text
  -ksp_compute_eigenvalues -ksp_gmres_restart 1000 -pc_type none
  ```

  For simple saddle point problems, try

  ```text
  -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_detect_saddle_point
  ```

  For more difficult problems, read the literature to find robust methods, and ask
  <mailto:petsc-users@mcs.anl.gov> or <mailto:petsc-maint@mcs.anl.gov> if you want advice about how to
  implement them.

- If the method converges in the preconditioned residual, but not in the true residual,
  the preconditioner is likely singular or nearly so. This is common for saddle point
  problems (e.g.
  incompressible flow) or strongly nonsymmetric operators (e.g. low-Mach hyperbolic
  problems with large time steps).

- The preconditioner is too weak or is unstable. See if `-pc_type asm -sub_pc_type lu`
  improves the convergence rate. If GMRES is losing too much progress in the restart, see
  if longer restarts help with `-ksp_gmres_restart 300`. If a transpose is available, try
  `-ksp_type bcgs` or other methods that do not require a restart.

  :::{note}
  Unfortunately, convergence with these methods is frequently erratic.
  :::

- The preconditioner is nonlinear (e.g. a nested iterative solve); try `-ksp_type
  fgmres` or `-ksp_type gcr`.

- You are using geometric multigrid, but some equations (often boundary conditions) are
  not scaled compatibly between levels. Try `-pc_mg_galerkin` to algebraically
  construct a correctly scaled coarse operator, or make sure that all the equations are
  scaled in the same way if you want to use rediscretized coarse levels.

- The matrix is very ill-conditioned. Check the {ref}`condition number <doc_faq_usage_condnum>`.

  - Try to improve it by choosing the relative scaling of components/boundary conditions.
  - Try `-ksp_diagonal_scale -ksp_diagonal_scale_fix`.
  - Perhaps change the formulation of the problem to produce more friendly algebraic
    equations.

- Change the units (nondimensionalization), boundary condition scaling, or formulation so
  that the Jacobian is better conditioned. See the [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).

- Classical Gram-Schmidt is becoming unstable; try `-ksp_gmres_modifiedgramschmidt` or
  use a method that orthogonalizes differently, e.g. `-ksp_type gcr`.
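Several of the individual checks above can be combined into one diagnostic run. In the sketch below, `./app` is a hypothetical placeholder for your own PETSc program; the options themselves are standard PETSc runtime options:

```console
$ ./app -ksp_view -ksp_converged_reason -ksp_monitor_true_residual \
        -pc_type asm -sub_pc_type lu -ksp_gmres_restart 300
```

Here `-ksp_view` prints the solver configuration actually being used, which often reveals that an option was misspelled and therefore silently ignored.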

### I get the error message: Actual argument at (1) to assumed-type dummy is of derived type with type-bound or FINAL procedures

Use the following code-snippet:

```fortran
module context_module
#include <petsc/finclude/petsc.h>
use petsc
implicit none
private
type, public :: context_type
   private
   PetscInt :: foo
contains
   procedure, public :: init => context_init
end type context_type
contains
subroutine context_init(self, foo)
   class(context_type), intent(in out) :: self
   PetscInt, intent(in) :: foo
   self%foo = foo
end subroutine context_init
end module context_module

!------------------------------------------------------------------------

program test_snes
use, intrinsic :: iso_c_binding
use petsc
use context_module
implicit none

SNES :: snes
type(context_type), target :: context
type(c_ptr) :: contextOut
PetscErrorCode :: ierr

call PetscInitialize(PETSC_NULL_CHARACTER, ierr)
call SNESCreate(PETSC_COMM_WORLD, snes, ierr)
call context%init(1)

contextOut = c_loc(context) ! contextOut is a C pointer to context

call SNESSetConvergenceTest(snes, convergence, contextOut, PETSC_NULL_FUNCTION, ierr)

call SNESDestroy(snes, ierr)
call PetscFinalize(ierr)

contains

subroutine convergence(snes, num_iterations, xnorm, pnorm, fnorm, reason, contextIn, ierr)
SNES, intent(in) :: snes

PetscInt, intent(in) :: num_iterations
PetscReal, intent(in) :: xnorm, pnorm, fnorm
SNESConvergedReason, intent(out) :: reason
type(c_ptr), intent(in out) :: contextIn
type(context_type), pointer :: context
PetscErrorCode, intent(out) :: ierr

! convert the C pointer to a Fortran pointer to use context as in the main program
call c_f_pointer(contextIn, context)
reason = 0
ierr = 0
end subroutine convergence
end program test_snes
```

### In C++ I get a crash on `VecDestroy()` (or some other PETSc object) at the end of the program

This can happen when the destructor for a C++ class is automatically called at the end of
the program, after `PetscFinalize()` has already run. Use the following code-snippet:

```cpp
int main(int argc, char **argv)
{
  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  {
    /* your variables */
    /* your code */

    /* ... all your destructors are called here automatically by C++, so they work correctly */
  }
  PetscCall(PetscFinalize());
  return 0;
}
```

______________________________________________________________________

## Debugging

### What does the message hwloc/linux: Ignoring PCI device with non-16bit domain mean?

This message is printed by the hwloc library, which is used by some MPI implementations. It can be ignored.
To prevent the message from always being printed, set the environment variable `HWLOC_HIDE_ERRORS` to 2.
For example

```console
$ export HWLOC_HIDE_ERRORS=2
```

which can be added to your login profile file, such as `~/.bashrc`.

### How do I turn off PETSc signal handling so I can use the `-C` option on `xlf`?

Immediately after calling `PetscInitialize()`, call `PetscPopSignalHandler()`.

Some Fortran compilers, including IBM's `xlf` and `xlF` compilers, have a compile option
(`-C` for IBM's) that causes all array accesses in Fortran to be checked to ensure they are
in-bounds. This is a great feature, but it does require that the array dimensions be set
explicitly, not with a `*`.

### How do I debug if `-start_in_debugger` does not work on my machine?

The script <https://github.com/Azrael3000/tmpi> can be used to run multiple MPI
ranks in the debugger using tmux.
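For example, assuming `tmpi` is installed on your `$PATH` and your executable (the hypothetical `./app` below) was compiled with debugging symbols, an invocation along the following lines starts four MPI ranks, each under `gdb` in its own tmux pane:

```console
$ tmpi 4 gdb ./app
```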

On newer macOS machines, one has to be in the `admin` group to be able to use a debugger.

On newer Ubuntu Linux machines, one has to disable `ptrace_scope` with

```console
$ echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
```

to get `-start_in_debugger` working. (Note that `sudo echo 0 > /proc/sys/kernel/yama/ptrace_scope`
does not work, because the redirection is performed by the unprivileged shell, not by `sudo`.)

If `-start_in_debugger` does not work on your OS, for a uniprocessor job just
try the debugger directly, for example `gdb ex1`. You can also use [TotalView](https://totalview.io/products/totalview), which is a good graphical parallel debugger.

### How do I see where my code is hanging?

You can use the `-start_in_debugger` option to start all processes in the debugger (each
will come up in its own xterm) or run in [TotalView](https://totalview.io/products/totalview). Then use `cont` (for continue) in each
xterm. Once you are sure that the program is hanging, hit control-c in each xterm and then
use `where` to print a stack trace for each process.

### How can I inspect PETSc vector and matrix values when in the debugger?

I will illustrate this with `gdb`, but it should be similar in other debuggers. You can
look at local `Vec` values directly by obtaining the array. For a `Vec` v, we can
print all local values using:

```console
(gdb) p ((Vec_Seq*) v->data)->array[0]@v->map.n
```

However, this becomes much more complicated for a matrix. Therefore, it is advisable to use the default viewer to look at the object. For a `Vec` v and a `Mat` m, this would be:

```console
(gdb) call VecView(v, 0)
(gdb) call MatView(m, 0)
```

or with a communicator other than `MPI_COMM_WORLD`:

```console
(gdb) call MatView(m, PETSC_VIEWER_STDOUT_(m->comm))
```

Totalview 8.8.0+ has a feature that allows libraries to provide their own code to
Thus in theory each PETSc object, `Vec`, `Mat` etc 1772could have custom code to print values in the object. We have only done this for the most 1773elementary display of `Vec` and `Mat`. See the routine `TV_display_type()` in 1774`src/vec/vec/interface/vector.c` for an example of how these may be written. Contact us 1775if you would like to add more. 1776 1777### How can I find the cause of floating point exceptions like not-a-number (NaN) or infinity? 1778 1779The best way to locate floating point exceptions is to use a debugger. On supported 1780architectures (including Linux and glibc-based systems), just run in a debugger and pass 1781`-fp_trap` to the PETSc application. This will activate signaling exceptions and the 1782debugger will break on the line that first divides by zero or otherwise generates an 1783exceptions. 1784 1785Without a debugger, running with `-fp_trap` in debug mode will only identify the 1786function in which the error occurred, but not the line or the type of exception. If 1787`-fp_trap` is not supported on your architecture, consult the documentation for your 1788debugger since there is likely a way to have it catch exceptions. 1789 1790(error_libimf)= 1791 1792### Error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory 1793 1794The Intel compilers use shared libraries (like libimf) that cannot be found, by default, at run 1795time. When using the Intel compilers (and running the resulting code) you must make sure 1796that the proper Intel initialization scripts are run. This is usually done by adding some 1797code into your `.cshrc`, `.bashrc`, `.profile` etc file. Sometimes on batch file 1798systems that do now access your initialization files (like .cshrc) you must include the 1799initialization calls in your batch file submission. 

For example, on my Mac using `csh` I have the following in my `.cshrc` file:

```csh
source /opt/intel/cc/10.1.012/bin/iccvars.csh
source /opt/intel/fc/10.1.012/bin/ifortvars.csh
source /opt/intel/idb/10.1.012/bin/idbvars.csh
```

And in my `.profile` I have

```sh
source /opt/intel/cc/10.1.012/bin/iccvars.sh
source /opt/intel/fc/10.1.012/bin/ifortvars.sh
source /opt/intel/idb/10.1.012/bin/idbvars.sh
```

(object_type_not_set)=

### What does "Object Type Not Set: Argument # N" mean?

Many operations on PETSc objects require that the specific type of the object be set before the operation is performed. You must call `XXXSetType()` or `XXXSetFromOptions()` before you make the offending call. For example

```c
Mat A;

PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
PetscCall(MatSetValues(A,....));
```

will not work. You must add `MatSetType()` or `MatSetFromOptions()` before the call to `MatSetValues()`, i.e.

```c
Mat A;

PetscCall(MatCreate(PETSC_COMM_WORLD, &A));

PetscCall(MatSetType(A, MATAIJ));
/* MatSetFromOptions() may override the type set above */
PetscCall(MatSetFromOptions(A));

PetscCall(MatSetValues(A,....));
```

(split_ownership)=

### What does the error detected in `PetscSplitOwnership()` about the "sum of local lengths ..." mean?

In a previous call to `VecSetSizes()`, `MatSetSizes()`, `VecCreateXXX()` or
`MatCreateXXX()` you passed in local and global sizes that do not make sense for the
correct number of processors. For example, if you pass in a local size of 2 and a global
size of 100 and run on two processors, this cannot work, since the sum of the local sizes
is 4, not 100.

(valgrind)=

### What does "corrupt argument", "caught signal", "SEGV", "segmentation violation", or "bus error" mean? Can I use Valgrind or CUDA Compute Sanitizer to debug memory corruption issues?

Sometimes it can mean an argument to a function is invalid. In Fortran this may be caused
by forgetting to list an argument in the call, especially the final `PetscErrorCode`.

Otherwise it is usually caused by memory corruption; that is, somewhere the code is writing
out of array bounds. To track this down, rerun the debug version of the code with the
option `-malloc_debug`. Occasionally the code may crash only with the optimized version;
in that case, run the optimized version with `-malloc_debug`. If you determine the
problem is memory corruption, you can put the macro `CHKMEMQ` in the code near the crash
to determine exactly what line is causing the problem.

If `-malloc_debug` does not help: on NVIDIA CUDA systems you can use the [CUDA Compute Sanitizer](https://docs.nvidia.com/compute-sanitizer/ComputeSanitizer/index.html),
for example, `compute-sanitizer --tool memcheck [sanitizer_options] app_name [app_options]`.

If `-malloc_debug` does not help: on GNU/Linux (but not macOS) machines you can
use [valgrind](http://valgrind.org). Follow the instructions below:

1. `configure` PETSc with `--download-mpich --with-debugging` (you can use other MPI implementations, but most produce spurious Valgrind messages).

2. Compile your application code with this build of PETSc.

3. Run with Valgrind.

   ```console
   $ $PETSC_DIR/lib/petsc/bin/petscmpiexec -valgrind -n NPROC PETSCPROGRAMNAME PROGRAMOPTIONS
   ```

   or

   ```console
   $ mpiexec -n NPROC valgrind --tool=memcheck -q --num-callers=20 \
       --suppressions=$PETSC_DIR/share/petsc/suppressions/valgrind \
       --log-file=valgrind.log.%p PETSCPROGRAMNAME -malloc off PROGRAMOPTIONS
   ```

:::{note}
- The option `--with-debugging` enables valgrind to give a stack trace with additional
  source-file:line-number info.
- The option `--download-mpich` gives a valgrind-clean MPI; other MPI builds are not valgrind clean.
- When `--download-mpich` is used, `mpiexec` will be in `$PETSC_ARCH/bin`.
- The `--log-file=valgrind.log.%p` option tells valgrind to store the output from each
  process in a different file (as %p, i.e. the PID, is different for each MPI process).
- `memcheck` will not find certain array accesses that violate static array
  declarations, so if memcheck runs clean you can try `--tool=exp-ptrcheck`
  instead.
:::

You might also consider using <http://drmemory.org>, which has support for GNU/Linux, Apple
macOS, and Microsoft Windows machines. (Note that we haven't tried this ourselves.)

(zeropivot)=

### What does "detected zero pivot in LU factorization" mean?

A zero pivot in LU, ILU, Cholesky, or ICC sparse factorization does not always mean that
the matrix is singular. You can use

```text
-pc_factor_shift_type nonzero -pc_factor_shift_amount [amount]
```

or

```text
-pc_factor_shift_type positive_definite -[level]_pc_factor_shift_type nonzero
 -pc_factor_shift_amount [amount]
```

or

```text
-[level]_pc_factor_shift_type positive_definite
```

to prevent the zero pivot. [level] is "sub" when lu, ilu, cholesky, or icc are employed in
each individual block of the bjacobi or ASM preconditioner; [level] is "mg_levels" or
"mg_coarse" when lu, ilu, cholesky, or icc are used inside the multigrid smoothers or the
coarse grid solver. See `PCFactorSetShiftType()` and `PCFactorSetShiftAmount()`.

This error can also happen if your matrix is singular; see `MatSetNullSpace()` for how
to handle this. If this error occurs in the zeroth row of the matrix, it is likely you
have an error in the code that generates the matrix.
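As a concrete illustration, when the factorization that produces the zero pivot runs inside the blocks of a block Jacobi preconditioner, the shift options above take the "sub" prefix; `./app` below is a hypothetical placeholder for your own program:

```console
$ ./app -pc_type bjacobi -sub_pc_type ilu \
        -sub_pc_factor_shift_type nonzero -sub_pc_factor_shift_amount 1e-10
```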

### You create draw windows or `PETSCVIEWERDRAW` windows or use options `-ksp_monitor draw::draw_lg` or `-snes_monitor draw::draw_lg` and the program seems to run OK but the windows never open

The libraries were compiled without support for X windows. Make sure that `configure`
was run with the option `--with-x`.

### The program seems to use more and more memory as it runs, even though you don't think you are allocating more memory

Some of the following may be occurring:

- You are creating new PETSc objects but never freeing them.
- There is a memory leak in PETSc or your code.
- Something much more subtle (if you are using Fortran): when you declare a large array
  in Fortran, the operating system does not allocate all the memory pages for that array
  until you start using the different locations in the array. Thus, in a code, if at each
  step you start using later values in the array, your virtual memory usage will "continue"
  to increase as measured by `ps` or `top`.
- You are running with the `-log`, `-log_mpe`, or `-log_all` option. With these
  options, a great deal of logging information is stored in memory until the conclusion of
  the run.
- You are linking with the MPI profiling libraries; these cause logging of all MPI
  activities. Another symptom is that at the conclusion of the run it may print some message
  about writing log files.

The following may help (roughly in the order of the causes above):

- Run with the `-malloc_debug` option and `-malloc_view`. Or use `PetscMallocDump()`
  and `PetscMallocView()` sprinkled about your code to track memory that is allocated
  and not later freed. Use the commands `PetscMallocGetCurrentUsage()` and
  `PetscMemoryGetCurrentUsage()` to monitor memory allocated, and
  `PetscMallocGetMaximumUsage()` and `PetscMemoryGetMaximumUsage()` for the total memory
  used as the code progresses.
- This is just the way Unix works and is harmless.
- Do not use the `-log`, `-log_mpe`, or `-log_all` option, or use
  `PetscLogEventDeactivate()` or `PetscLogEventDeactivateClass()` to turn off logging of
  specific events.
- Make sure you do not link with the MPI profiling libraries.

### When calling `MatPartitioningApply()` you get a message "Error! Key 16615 Not Found"

The graph of the matrix you are using is not symmetric. You must use symmetric matrices
for partitioning.

### With GMRES at restart the second residual norm printed does not match the first

I.e.

```text
26 KSP Residual norm 3.421544615851e-04
27 KSP Residual norm 2.973675659493e-04
28 KSP Residual norm 2.588642948270e-04
29 KSP Residual norm 2.268190747349e-04
30 KSP Residual norm 1.977245964368e-04
30 KSP Residual norm 1.994426291979e-04 <----- At restart the residual norm is printed a second time
```

This is actually not surprising! GMRES computes the norm of the residual at each iteration
via a recurrence relation between the norms of the residuals at the previous iterations
and quantities computed at the current iteration; it does not compute it directly
as $|| b - A x^{n} ||$.

Sometimes, especially with an ill-conditioned matrix, or computation of the matrix-vector
product via differencing, the residual norms computed by GMRES start to "drift" from the
correct values. At the restart, we compute the residual norm directly, hence the "strange
stuff", i.e. the difference printed. The drift, if it remains small, is harmless (it does
not affect the accuracy of the solution that GMRES computes).

If you use a more powerful preconditioner the drift will often be smaller and less
noticeable. Or if you are running matrix-free you may need to tune the matrix-free
parameters.

### Why do some Krylov methods seem to print two residual norms per iteration?

I.e.

```text
1198 KSP Residual norm 1.366052062216e-04
1198 KSP Residual norm 1.931875025549e-04
1199 KSP Residual norm 1.366026406067e-04
1199 KSP Residual norm 1.931819426344e-04
```

Some Krylov methods, for example `KSPTFQMR`, actually have a "sub-iteration" of size 2
inside the loop. Each of the two substeps has its own matrix-vector product and
application of the preconditioner and updates the residual approximations. This is why you
get this "funny" output where it looks like there are two residual norms per
iteration. You can also think of it as twice as many iterations.

### Unable to locate PETSc dynamic library `libpetsc`

When using dynamic libraries, the libraries cannot be moved after they are
installed. This could also happen on clusters, where the paths are different on the (run)
nodes than on the (compile) front-end. **Do not use dynamic libraries & shared
libraries**. Run `configure` with
`--with-shared-libraries=0 --with-dynamic-loading=0`.

:::{important}
The `--with-dynamic-loading` option has been removed in PETSc v3.5.
:::

### How do I determine what update to PETSc broke my code?

If at some point (in the PETSc code history) you had a working code, but the latest PETSc
code broke it, it is possible to determine the PETSc code change that might have caused this
behavior. This is achieved by:

- Using Git to access the PETSc sources
- Knowing the Git commit for the known working version of PETSc
- Knowing the Git commit for the known broken version of PETSc
- Using the [bisect](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html)
  functionality of Git

Git bisect can be done as follows:

1. Get the PETSc development (main branch in Git) sources:

   ```console
   $ git clone https://gitlab.com/petsc/petsc.git
   ```

2. Find the good and bad markers to start the bisection process.
   This can be done either
   by checking `git log` or `gitk` or <https://gitlab.com/petsc/petsc> or the web
   history of petsc-release clones. Let's say the known bad commit is 21af4baa815c and
   the known good commit is 5ae5ab319844.

3. Start the bisection process with these known revisions. Build PETSc, and test your code
   to confirm the known good/bad behavior:

   ```console
   $ git bisect start 21af4baa815c 5ae5ab319844
   ```

   Build and test; perhaps you discover that this new state is bad:

   ```console
   $ git bisect bad
   ```

   Build and test; perhaps you discover that this state is good:

   ```console
   $ git bisect good
   ```

   Keep bisecting, building PETSc, and testing your code with it, determining whether
   the code works or not, until done. After something like 5-15 iterations, `git bisect`
   will pinpoint the exact code change that resulted in the difference in application
   behavior.

:::{tip}
See [git-bisect(1)](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html) and the
[debugging section of the Git Book](https://git-scm.com/book/en/Git-Tools-Debugging-with-Git) for more debugging tips.
:::

### How to fix the error "PMIX Error: error in file gds_ds12_lock_pthread.c"?

This seems to be an error when using Open MPI and OpenBLAS with threads (or perhaps other
packages that use threads). Setting

```console
$ export PMIX_MCA_gds=hash
```

should resolve the problem.

______________________________________________________________________

(doc_faq_sharedlibs)=

## Shared Libraries

### Can I install PETSc libraries as shared libraries?

Yes. Use

```console
$ ./configure --with-shared-libraries
```

### Why should I use shared libraries?

When you link to shared libraries, the function symbols from the shared libraries are not
copied into the executable. This way the size of the executable is considerably smaller than
when using regular libraries. This helps in a couple of ways:

- It saves disk space when more than one executable is created.
- It improves link time immensely, because the linker has to write a much smaller
  file (executable) to the disk.

### How do I link to the PETSc shared libraries?

By default, the compiler should pick up the shared libraries instead of the regular
ones. Nothing special should be done for this.

### What if I want to link to the regular `.a` library files?

You must run `configure` with the option `--with-shared-libraries=0` (you can use a
different `$PETSC_ARCH` for this build so you can easily switch between the two).

### What do I do if I want to move my executable to a different machine?

You would also need to have access to the shared libraries on this new machine. The other
alternative is to build the executable without shared libraries by first deleting the
shared libraries and then creating the executable.

```{bibliography} /petsc.bib
:filter: docname in docnames
```