(doc_faq)=

# FAQ

```{contents} Table Of Contents
:backlinks: top
:local: true
```

______________________________________________________________________

## General

### How can I subscribe to the PETSc mailing lists?

See mailing list {ref}`documentation <doc_mail>`

### Any useful books on numerical computing?

[Bueler, PETSc for Partial Differential Equations: Numerical Solutions in C and Python](https://my.siam.org/Store/Product/viewproduct/?ProductId=32850137)

[Oliveira and Stewart, Writing Scientific Software: A Guide to Good Style](https://www.cambridge.org/core/books/writing-scientific-software/23206704175AF868E43FE3FB399C2F53)

(doc_faq_general_parallel)=

### What kind of parallel computers or clusters are needed to use PETSc? Or why do I get little speedup?

:::{important}
PETSc can be used with any kind of parallel system that supports MPI, BUT for any decent
performance one needs:

- A fast, **low-latency** interconnect; any ethernet (even 10 GigE) simply cannot provide
  the needed performance.
- High per-core **memory** performance. Each core needs its **own** memory bandwidth of
  at least 2 gigabytes/second. Most modern computers are not bottlenecked by how fast
  they can perform calculations; rather, they are usually restricted by how quickly they
  can get their data.
:::

To obtain good performance it is important that you know your machine! That is, how many
compute nodes (generally, how many motherboards), how many memory sockets per node, how
many cores per memory socket, and how much memory bandwidth each has.
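As a quick sketch, on Linux these topology questions can usually be answered with standard utilities (`nproc` from coreutils, `lscpu` from util-linux; the exact output format varies by machine, so treat this as a starting point rather than a definitive recipe):

```shell
# Number of logical CPUs visible to the OS
nproc

# Sockets, cores per socket, and NUMA nodes; '|| true' guards against
# systems where lscpu is missing or prints a different layout
lscpu | grep -iE 'socket|core|numa' || true
```

Memory bandwidth itself cannot be read from a file; measure it with the STREAMS benchmark as described next.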
If you do not know this and can run MPI programs with mpiexec (that is, you don't have a
batch system), run the following from `$PETSC_DIR`:

```console
$ make streams [NPMAX=maximum_number_of_mpi_processes_you_plan_to_use]
```

This will provide a summary of the memory bandwidth achieved with different numbers of MPI
processes and the potential speedups. See {any}`ch_streams` for a detailed discussion.

If you have a batch system:

```console
$ cd $PETSC_DIR/src/benchmarks/streams
$ make MPIVersion
submit MPIVersion to the batch system a number of times with 1, 2, 3, etc. MPI processes
collecting all of the output from the runs into the single file scaling.log. Copy
scaling.log into the src/benchmarks/streams directory.
$ ./process.py createfile ; process.py
```

Even if you have enough memory bandwidth, performance can degrade if the OS switches
processes between cores. Smart process-to-core/socket binding (this just means locking a
process to a particular core or memory socket) may help you. For example, consider using
fewer processes than cores and binding processes to separate sockets so that each process
uses a different memory bus:

- [MPICH2 binding with the Hydra process manager](https://github.com/pmodels/mpich/blob/main/doc/wiki/how_to/Using_the_Hydra_Process_Manager.md#process-core-binding)

  ```console
  $ mpiexec.hydra -n 4 --binding cpu:sockets
  ```

- [Open MPI binding](https://www.open-mpi.org/faq/?category=tuning#using-paffinity)

  ```console
  $ mpiexec -n 4 --map-by socket --bind-to socket --report-bindings
  ```

- `taskset`, part of the [util-linux](https://github.com/karelzak/util-linux) package

  Check `man taskset` for details. Make sure to set affinity for **your** program,
  **not** for the `mpiexec` program.

- `numactl`

  In addition to task affinity, this tool also allows changing the default memory affinity
  policy.
  On Linux, the default policy is to attempt to find memory on the same memory bus
  that serves the core that a thread is running on when the memory is faulted
  (not when `malloc()` is called). If local memory is not available, it is found
  elsewhere, possibly leading to serious memory imbalances.

  The option `--localalloc` allocates memory on the local NUMA node, similar to the
  `numa_alloc_local()` function in the `libnuma` library. The option
  `--cpunodebind=nodes` binds the process to a given NUMA node (note that this can be
  larger or smaller than a CPU (socket); a NUMA node usually has multiple cores).

  The option `--physcpubind=cpus` binds the process to a given processor core (numbered
  according to `/proc/cpuinfo`, therefore including logical cores if Hyper-Threading is
  enabled).

  With Open MPI, you can use knowledge of the NUMA hierarchy and core numbering on your
  machine to calculate the correct NUMA node or processor number given the environment
  variable `OMPI_COMM_WORLD_LOCAL_RANK`. In most cases, it is easier to make mpiexec or
  a resource manager set affinities.

The software [Open-MX](http://open-mx.gforge.inria.fr) provides faster speeds for
ethernet systems. We have not tried it, but it claims it can dramatically reduce latency
and increase bandwidth on Linux systems. You must first install this software and then
install MPICH or Open MPI to use it.

### What kind of license is PETSc released under?

See licensing {ref}`documentation <doc_license>`

### Why is PETSc written in C, instead of Fortran or C++?

When this decision was made, in the early 1990s, C enabled us to build data structures
for storing sparse matrices, solver information,
etc. in ways that Fortran simply did not allow. ANSI C was a complete standard that all
modern C compilers supported. The language was identical on all machines.
C++ was still
evolving and compilers on different machines were not identical. Using C function pointers
to provide data encapsulation and polymorphism allowed us to get many of the advantages of
C++ without using such a large and complicated language. It would have been natural and
reasonable to have coded PETSc in C++; we opted to use C instead.

### Does all the PETSc error checking and logging reduce PETSc's efficiency?

No

(doc_faq_maintenance_strats)=

### How do such a small group of people manage to write and maintain such a large and marvelous package as PETSc?

1. **We work very efficiently**.

   - We use powerful editors and programming environments.
   - Our manual pages are generated automatically from formatted comments in the code,
     thus alleviating the need for creating and maintaining manual pages.
   - We employ continuous integration testing of the entire PETSc library on many different
     machine architectures. This process **significantly** protects (no bug-catching
     process is perfect) against inadvertently introducing bugs with new additions. Every
     new feature **must** pass our suite of thousands of tests as well as formal code
     review before it may be included.

2. **We are very careful in our design (and are constantly revising our design)**

   - PETSc as a package should be easy to use, write, and maintain. Our mantra is to write
     code as if everyone is using it.

3. **We are willing to do the grunt work**

   - PETSc is regularly checked to make sure that all code conforms to our interface
     design. We will never keep a bad design decision simply because changing it would
     require a lot of editing; we do a lot of editing.

4. **We constantly seek out and experiment with new design ideas**

   - We retain the useful ones and discard the rest. All of these decisions are based not
     just on performance, but also on **practicality**.
165 1665. **Function and variable names must adhere to strict guidelines** 167 168 - Even the rules about capitalization are designed to make it easy to figure out the 169 name of a particular object or routine. Our memories are terrible, so careful 170 consistent naming puts less stress on our limited human RAM. 171 1726. **The PETSc directory tree is designed to make it easy to move throughout the 173 entire package** 174 1757. **We have a rich, robust, and fast bug reporting system** 176 177 - <mailto:petsc-maint@mcs.anl.gov> is always checked, and we pride ourselves on responding 178 quickly and accurately. Email is very lightweight, and so bug reports system retains 179 an archive of all reported problems and fixes, so it is easy to re-find fixes to 180 previously discovered problems. 181 1828. **We contain the complexity of PETSc by using powerful object-oriented programming 183 techniques** 184 185 - Data encapsulation serves to abstract complex data formats or movement to 186 human-readable format. This is why your program cannot, for example, look directly 187 at what is inside the object `Mat`. 188 - Polymorphism makes changing program behavior as easy as possible, and further 189 abstracts the *intent* of your program from what is *written* in code. You call 190 `MatMult()` regardless of whether your matrix is dense, sparse, parallel or 191 sequential; you don't call a different routine for each format. 192 1939. **We try to provide the functionality requested by our users** 194 195### For complex numbers will I get better performance with C++? 196 197To use PETSc with complex numbers you may use the following `configure` options; 198`--with-scalar-type=complex` and either `--with-clanguage=c++` or (the default) 199`--with-clanguage=c`. In our experience they will deliver very similar performance 200(speed), but if one is concerned they should just try both and see if one is faster. 
201 202### How come when I run the same program on the same number of processes I get a "different" answer? 203 204Inner products and norms in PETSc are computed using the `MPI_Allreduce()` command. For 205different runs the order at which values arrive at a given process (via MPI) can be in a 206different order, thus the order in which some floating point arithmetic operations are 207performed will be different. Since floating point arithmetic is not 208associative, the computed quantity may be slightly different. 209 210Over a run the many slight differences in the inner products and norms will effect all the 211computed results. It is important to realize that none of the computed answers are any 212less right or wrong (in fact the sequential computation is no more right then the parallel 213ones). All answers are equal, but some are more equal than others. 214 215The discussion above assumes that the exact same algorithm is being used on the different 216number of processes. When the algorithm is different for the different number of processes 217(almost all preconditioner algorithms except Jacobi are different for different number of 218processes) then one expects to see (and does) a greater difference in results for 219different numbers of processes. In some cases (for example block Jacobi preconditioner) it 220may be that the algorithm works for some number of processes and does not work for others. 221 222### How come when I run the same linear solver on a different number of processes it takes a different number of iterations? 223 224The convergence of many of the preconditioners in PETSc including the default parallel 225preconditioner block Jacobi depends on the number of processes. The more processes the 226(slightly) slower convergence it has. This is the nature of iterative solvers, the more 227parallelism means the more "older" information is used in the solution process hence 228slower convergence. 
(doc_faq_gpuhowto)=

### Can PETSc use GPUs to speed up computations?

:::{seealso}
See the GPU development {ref}`roadmap <doc_gpu_roadmap>` for the latest information
regarding the state of PETSc GPU integration.

See the GPU install {ref}`documentation <doc_config_accel>` for up-to-date information on
installing PETSc to use GPUs.
:::

Quick summary of usage with CUDA:

- The `VecType` `VECSEQCUDA`, `VECMPICUDA`, or `VECCUDA` may be used with
  `VecSetType()` or `-vec_type seqcuda`, `mpicuda`, or `cuda` when
  `VecSetFromOptions()` is used.
- The `MatType` `MATSEQAIJCUSPARSE`, `MATMPIAIJCUSPARSE`, or `MATAIJCUSPARSE`
  may be used with `MatSetType()` or `-mat_type seqaijcusparse`, `mpiaijcusparse`, or
  `aijcusparse` when `MatSetFromOptions()` is used.
- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type
  cuda` and `-dm_mat_type aijcusparse`.

Quick summary of usage with OpenCL (provided by the ViennaCL library):

- The `VecType` `VECSEQVIENNACL`, `VECMPIVIENNACL`, or `VECVIENNACL` may be used
  with `VecSetType()` or `-vec_type seqviennacl`, `mpiviennacl`, or `viennacl`
  when `VecSetFromOptions()` is used.
- The `MatType` `MATSEQAIJVIENNACL`, `MATMPIAIJVIENNACL`, or `MATAIJVIENNACL`
  may be used with `MatSetType()` or `-mat_type seqaijviennacl`, `mpiaijviennacl`, or
  `aijviennacl` when `MatSetFromOptions()` is used.
- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type
  viennacl` and `-dm_mat_type aijviennacl`.

General hints:

- Since debugging on GPUs is difficult, it is useful to develop your code with the default
  vectors and then use the command line options to switch to the GPU for production runs.
- All of the Krylov methods except `KSPIBCGS` run on the GPU.
- Parts of most preconditioners run directly on the GPU.
  After setup, `PCGAMG` runs
  fully on GPUs, without any memory copies between the CPU and GPU.

Some GPU systems (for example many laptops) only run with single precision; thus, PETSc
must be built with the `configure` option `--with-precision=single`.

(doc_faq_extendedprecision)=

### Can I run PETSc with extended precision?

Yes, with gcc and gfortran. `configure` PETSc using the
options `--with-precision=__float128` and `--download-f2cblaslapack`.

:::{admonition} Warning
:class: yellow

External packages are not guaranteed to work in this mode!
:::

### Why doesn't PETSc use QD to implement support for extended precision?

We tried really hard but could not. The problem is that the QD C++ classes, though they
try to implement the built-in `double` data type, are not native types and cannot
"just be used" in a general piece of numerical source code. Rather, the code has to be
rewritten to live within the limitations of the QD classes. However, PETSc can be built to use
quad precision, as detailed {ref}`here <doc_faq_extendedprecision>`.

### How do I cite PETSc?

Use {any}`these citations <doc_index_citing_petsc>`.

______________________________________________________________________

## Installation

### How do I begin using PETSc if the software has already been completely built and installed by someone else?

Assuming that the PETSc libraries have been successfully built for a particular
architecture and level of optimization, a new user must merely:

1. Set `PETSC_DIR` to the full path of the PETSc home
   directory. This will be the location of the `configure` script, and usually called
   "petsc" or some variation of that (for example, /home/username/petsc).
2. Set `PETSC_ARCH`, which indicates the configuration on which PETSc will be
   used.
   Note that `$PETSC_ARCH` is simply a name the installer used when installing
   the libraries. There will exist a directory within `$PETSC_DIR` that is named after
   its corresponding `$PETSC_ARCH`. There may be several on a single system, for
   example "linux-c-debug" for the debug version compiled by a C compiler or
   "linux-c-opt" for the optimized version.

:::{admonition} Still Stuck?
See the {ref}`quick-start tutorial <tut_install>` for a step-by-step guide on
installing PETSc, in case you have missed a step.

See the users manual section on {ref}`getting started <sec_getting_started>`.
:::

### The PETSc distribution is SO large. How can I reduce my disk space usage?

1. The PETSc users manual is provided in PDF format at `$PETSC_DIR/manual.pdf`. You
   can delete this.
2. The PETSc test suite contains sample output for many of the examples. These are
   contained in the PETSc directories `$PETSC_DIR/src/*/tutorials/output` and
   `$PETSC_DIR/src/*/tests/output`. Once you have run the test examples, you may remove
   all of these directories to save some disk space. You can locate the largest with,
   e.g., `find . -name output -type d | xargs du -sh | sort -hr` on a Unix-based system.
3. The debugging versions of the libraries are larger than the optimized versions. In a
   pinch you can work with the optimized version, although we bid you good luck in
   finding bugs, as it is much easier with the debug version.

### I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI?

No, run `configure` with the option `--with-mpi=0`

### Can I install PETSc to not use X windows (either under Unix or Microsoft Windows with GCC)?

Yes. Run `configure` with the additional flag `--with-x=0`

### Why do you use MPI?

MPI is the message-passing standard.
Because it is a standard, it will not frequently change over
time; thus, we do not have to change PETSc every time the provider of the message-passing
system decides to make an interface change. MPI was carefully designed by experts from
industry, academia, and government labs to provide the highest quality performance and
capability.

For example, the careful design of communicators in MPI allows the easy nesting of
different libraries; no other message-passing system provides this support. All of the
major parallel computer vendors were involved in the design of MPI and have committed to
providing quality implementations.

In addition, since MPI is a standard, several different groups have already provided
complete, free implementations. Thus, one does not have to rely on the technical skills of
one particular group to provide the message-passing libraries. Today, MPI is the only
practical, portable approach to writing efficient parallel numerical software.

(invalid_mpi_compilers)=

### What do I do if my MPI compiler wrappers are invalid?

Most MPI implementations provide compiler wrappers (such as `mpicc`) which give the
underlying compilers the include and link options necessary to use that version of
MPI. Configuration will fail if these wrappers are either absent or broken in the MPI pointed to by
`--with-mpi-dir`. You can rerun `configure` with the additional option
`--with-mpi-compilers=0`, which will try to auto-detect working compilers; however,
these compilers may be incompatible with the particular MPI build. If this fix does not
work, run with `--with-cc=[your_c_compiler]` where you know `your_c_compiler` works
with this particular MPI, and likewise for C++ (`--with-cxx=[your_cxx_compiler]`) and Fortran
(`--with-fc=[your_fortran_compiler]`).

(bit_indices)=

### When should/can I use the `configure` option `--with-64-bit-indices`?
By default the type that PETSc uses to index into arrays and keep track of the sizes of
arrays is `PetscInt`, defined to be a 32-bit `int`. If your problem:

- Involves more than 2^31 - 1 unknowns (around 2 billion), or
- Your matrix might contain more than 2^31 - 1 nonzeros on a single process,

then you need to use this option. Otherwise you will get strange crashes.

This option can be used when you are using either 32-bit or 64-bit pointers. You do not
need to use this option if you are using 64-bit pointers unless the conditions above
hold.

### What if I get an internal compiler error?

You can rebuild the offending file individually with a lower optimization level. **Then
make sure to complain to the compiler vendor and file a bug report**. For example, if the
compiler chokes on `src/mat/impls/baij/seq/baijsolvtrannat.c` you can run the following
from `$PETSC_DIR`:

```console
$ make -f gmakefile PCC_FLAGS="-O1" $PETSC_ARCH/obj/src/mat/impls/baij/seq/baijsolvtrannat.o
$ make all
```

### How do I enable Python bindings (petsc4py) with PETSc?

1. Install [Cython](https://cython.org/).
2. `configure` PETSc with the `--with-petsc4py=1` option.
3. Set `PYTHONPATH=$PETSC_DIR/$PETSC_ARCH/lib`

(macos_gfortran)=

### What Fortran compiler do you recommend on macOS?

We recommend using [homebrew](https://brew.sh/) to install [gfortran](https://gcc.gnu.org/wiki/GFortran); see {any}`doc_macos_install`

### How can I find the URL locations of the packages you install using `--download-PACKAGE`?
```console
$ grep "self.download " $PETSC_DIR/config/BuildSystem/config/packages/*.py
```

### How to fix the problem: PETSc was configured with one MPICH (or Open MPI) `mpi.h` version but now appears to be compiling using a different MPICH (or Open MPI) `mpi.h` version

This generally happens for one of two reasons:

- You previously ran `configure` with the option `--download-mpich` (or `--download-openmpi`)
  but later ran `configure` to use a version of MPI already installed on the
  machine. Solution:

  ```console
  $ rm -rf $PETSC_DIR/$PETSC_ARCH
  $ ./configure --your-args
  ```

(mpi_network_misconfigure)=

### What does it mean when `make check` hangs or errors on `PetscOptionsInsertFile()`?

For example:

```none
Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
See https://petsc.org/release/faq/
[0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c
[0]PETSC ERROR: #3 PetscInitialize() line 828 in /Users/barrysmith/Src/PETSc/src/sys/objects/pinit.c
```

or

```none
$ make check
Running check examples to verify correct installation
Using PETSC_DIR=/Users/barrysmith/Src/petsc and PETSC_ARCH=arch-fix-mpiexec-hang-2-ranks
C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
PROGRAM SEEMS TO BE HANGING HERE
```

This usually occurs when network settings are misconfigured (perhaps due to a VPN), resulting in a failure or hang in the system call `gethostbyname()`.

- Verify you are using the correct `mpiexec` for the MPI you have linked PETSc with.

- If you have a VPN enabled on your machine, try turning it off and then running `make check` to
  verify that it is not the VPN playing poorly with MPI.
- If ``ping `hostname` `` (`/sbin/ping` on macOS) fails or hangs, do:

  ```none
  echo 127.0.0.1 `hostname` | sudo tee -a /etc/hosts
  ```

  and try `make check` again.

- Try completely disconnecting your machine from the network and see if `make check` then works.

- Try the PETSc `configure` option `--download-mpich-device=ch3:nemesis` with `--download-mpich`.

______________________________________________________________________

## Usage

### How can I redirect PETSc's `stdout` and `stderr` when programming with a GUI interface in Windows Developer Studio or to C++ streams?

To overload just the error messages, write your own `MyPrintError()` function that does
whatever you want (including pop up windows etc.) and use it like below.

```c
extern "C" {
  int PASCAL WinMain(HINSTANCE,HINSTANCE,LPSTR,int);
};

#include <petscsys.h>
#include <mpi.h>

const char help[] = "Set up from main";

int MyPrintError(const char error[], ...)
{
  printf("%s", error);
  return 0;
}

int main(int ac, char *av[])
{
  char buf[256];
  HINSTANCE inst;

  inst = (HINSTANCE)GetModuleHandle(NULL);
  PetscErrorPrintf = MyPrintError;

  buf[0] = 0;
  for (int i = 1; i < ac; ++i) {
    strcat(buf, av[i]);
    strcat(buf, " ");
  }

  PetscCall(PetscInitialize(&ac, &av, NULL, help));

  return WinMain(inst, NULL, buf, SW_SHOWNORMAL);
}
```

Place this file in the project and compile with these preprocessor definitions:

```
WIN32
_DEBUG
_CONSOLE
_MBCS
USE_PETSC_LOG
USE_PETSC_BOPT_g
USE_PETSC_STACK
_AFXDLL
```

And these link options:

```
/nologo
/subsystem:console
/incremental:yes
/debug
/machine:I386
/nodefaultlib:"libcmtd.lib"
/nodefaultlib:"libcd.lib"
/nodefaultlib:"mvcrt.lib"
/pdbtype:sept
```

:::{note}
The above is compiled and linked as if it were a console program. The linker will search
for a main, and from it the `WinMain` will start. This works with MFC templates and
derived classes too.

When writing a Windows console application you do not need to do anything; the `stdout`
and `stderr` are automatically output to the console window.
561::: 562 563To change where all PETSc `stdout` and `stderr` go, (you can also reassign 564`PetscVFPrintf()` to handle `stdout` and `stderr` any way you like) write the 565following function: 566 567``` 568PetscErrorCode mypetscvfprintf(FILE *fd, const char format[], va_list Argp) 569{ 570 PetscFunctionBegin; 571 if (fd != stdout && fd != stderr) { /* handle regular files */ 572 PetscCall(PetscVFPrintfDefault(fd, format, Argp)); 573 } else { 574 char buff[1024]; /* Make sure to assign a large enough buffer */ 575 int length; 576 577 PetscCall(PetscVSNPrintf(buff, 1024, format, &length, Argp)); 578 579 /* now send buff to whatever stream or whatever you want */ 580 } 581 PetscFunctionReturn(PETSC_SUCCESS); 582} 583``` 584 585Then assign `PetscVFPrintf = mypetscprintf` before `PetscInitialize()` in your main 586program. 587 588### I want to use Hypre boomerAMG without GMRES but when I run `-pc_type hypre -pc_hypre_type boomeramg -ksp_type preonly` I don't get a very accurate answer! 589 590You should run with `-ksp_type richardson` to have PETSc run several V or W 591cycles. `-ksp_type preonly` causes boomerAMG to use only one V/W cycle. You can control 592how many cycles are used in a single application of the boomerAMG preconditioner with 593`-pc_hypre_boomeramg_max_iter <it>` (the default is 1). You can also control the 594tolerance boomerAMG uses to decide if to stop before `max_iter` with 595`-pc_hypre_boomeramg_tol <tol>` (the default is 1.e-7). Run with `-ksp_view` to see 596all the hypre options used and `-help | grep boomeramg` to see all the command line 597options. 598 599### How do I use PETSc for Domain Decomposition? 600 601PETSc includes Additive Schwarz methods in the suite of preconditioners under the umbrella 602of `PCASM`. These may be activated with the runtime option `-pc_type asm`. 
Various
other options may be set, including the degree of overlap with `-pc_asm_overlap <number>` and the
type of restriction/extension with `-pc_asm_type [basic,restrict,interpolate,none]`,
among several others. You may see the available ASM options by using `-pc_type asm
-help`. See the procedural interfaces in the manual pages, for example `PCASMType()`,
and check the index of the users manual for `PCASMCreateSubdomains()`.

PETSc also contains a domain-decomposition-inspired wirebasket or face-based two-level
method where the coarse-mesh to fine-mesh interpolation is defined by solving specific
local subdomain problems. It currently only works for 3D scalar problems on structured
grids created with the PETSc `DMDA`. See the manual page for `PCEXOTIC` and
`src/ksp/ksp/tutorials/ex45.c` for an example.

PETSc also contains a balancing Neumann-Neumann type preconditioner; see the manual page
for `PCBDDC`. This requires matrices be constructed with `MatCreateIS()` via the finite
element method. See `src/ksp/ksp/tests/ex59.c` for an example.

### You have `MATAIJ` and `MATBAIJ` matrix formats, and `MATSBAIJ` for symmetric storage, how come no `MATSAIJ`?

Just for historical reasons; the `MATSBAIJ` format with block size one is just as efficient as
a `MATSAIJ` would be.

### Can I create BAIJ matrices with different size blocks for different block rows?

No. The `MATBAIJ` format only supports a single fixed block size for the entire
matrix. But the `MATAIJ` format automatically searches for matching rows and thus still
takes advantage of the natural blocks in your matrix to obtain good performance.

:::{note}
If you use `MATAIJ` you cannot use `MatSetValuesBlocked()`.
:::

### How do I access the values of a remote parallel PETSc Vec?

1. On each process create a local `Vec` large enough to hold all the values it wishes to
   access.
2.
   Create a `VecScatter` that scatters from the parallel `Vec` into the local `Vec`.
3. Use `VecGetArray()` to access the values in the local `Vec`.

For example, assume we have distributed a vector `vecGlobal` of size $N$ to
$R$ ranks and each rank holds $N/R = m$ values (assume that
$N$ is cleanly divisible by $R$). We want each rank $r$ to gather the
first $n$ (also assume $n \leq m$) values from its immediately superior neighbor
$r+1$ (the final rank will retrieve from rank 0).

```
Vec vecLocal;
IS isLocal, isGlobal;
VecScatter ctx;
PetscScalar *arr;
PetscInt n, N, firstGlobalIndex; /* n: number of values to gather, as described above */
MPI_Comm comm;
PetscMPIInt r, R;

/* Create sequential local vector, big enough to hold local portion */
PetscCall(VecCreateSeq(PETSC_COMM_SELF, n, &vecLocal));

/* Create IS to describe where we want to scatter to */
PetscCall(ISCreateStride(PETSC_COMM_SELF, n, 0, 1, &isLocal));

/* Compute the global indices */
PetscCall(VecGetSize(vecGlobal, &N));
PetscCall(PetscObjectGetComm((PetscObject) vecGlobal, &comm));
PetscCallMPI(MPI_Comm_rank(comm, &r));
PetscCallMPI(MPI_Comm_size(comm, &R));
firstGlobalIndex = r == R-1 ? 0 : (N/R)*(r+1);

/* Create IS that describes where we want to scatter from */
PetscCall(ISCreateStride(comm, n, firstGlobalIndex, 1, &isGlobal));

/* Create the VecScatter context */
PetscCall(VecScatterCreate(vecGlobal, isGlobal, vecLocal, isLocal, &ctx));

/* Gather the values */
PetscCall(VecScatterBegin(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));
PetscCall(VecScatterEnd(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD));

/* Retrieve and do work */
PetscCall(VecGetArray(vecLocal, &arr));
/* Work */
PetscCall(VecRestoreArray(vecLocal, &arr));

/* Don't forget to clean up */
PetscCall(ISDestroy(&isLocal));
PetscCall(ISDestroy(&isGlobal));
PetscCall(VecScatterDestroy(&ctx));
PetscCall(VecDestroy(&vecLocal));
```

(doc_faq_usage_alltoone)=

### How do I collect to a single processor all the values from a parallel PETSc Vec?

1. Create the `VecScatter` context that will do the communication:

   ```
   Vec in_par,out_seq;
   VecScatter ctx;

   PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
   ```

2. Initiate the communication (this may be repeated if you wish):

   ```
   PetscCall(VecScatterBegin(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
   PetscCall(VecScatterEnd(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD));
   /* May destroy context now if no additional scatters are needed, otherwise reuse it */
   PetscCall(VecScatterDestroy(&ctx));
   ```

Note that this simply concatenates in the parallel ordering of the vector (computed by the
`MPI_Comm_rank` of the owning process).
If you are using a `Vec` from
`DMCreateGlobalVector()` you likely want to first call `DMDAGlobalToNaturalBegin()`
followed by `DMDAGlobalToNaturalEnd()` to scatter the original `Vec` into the natural
ordering in a new global `Vec` before calling `VecScatterBegin()`/`VecScatterEnd()`
to scatter the natural `Vec` onto all processes.

### How do I collect all the values from a parallel PETSc Vec on the 0th rank?

See the FAQ entry on collecting to {ref}`an arbitrary processor <doc_faq_usage_alltoone>`, but
replace

```
PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq));
```

with

```
PetscCall(VecScatterCreateToZero(in_par, &ctx, &out_seq));
```

:::{note}
The same ordering considerations as discussed in the aforementioned entry also apply
here.
:::

### How can I read in or write out a sparse matrix in Matrix Market, Harwell-Boeing, Slapc or other ASCII format?

If you can read or write your matrix using Python or MATLAB/Octave, `PetscBinaryIO`
modules are provided at `$PETSC_DIR/lib/petsc/bin` for each language that can assist
with reading and writing. If you just want to convert `MatrixMarket`, you can use:

```console
$ python -m $PETSC_DIR/lib/petsc/bin/PetscBinaryIO convert matrix.mtx
```

to produce `matrix.petsc`.

You can also call the script directly or import it from your Python code. There is also a
`PETScBinaryIO.jl` Julia package.

For other formats, either adapt one of the above libraries or see the examples in
`$PETSC_DIR/src/mat/tests`, specifically `ex72.c` or `ex78.c`. You will likely need
to modify the code slightly to match your required ASCII format.

:::{note}
Never read or write an ASCII matrix file in parallel.
Instead, read it in sequentially with a standalone code based on `ex72.c` or `ex78.c`,
then save the matrix with the binary viewer `PetscViewerBinaryOpen()` and load the
matrix in parallel in your "real" PETSc program with `MatLoad()`.

For writing, save with the binary viewer and then load with the sequential code to store
it as ASCII.
:::

### Does TSSetFromOptions(), SNESSetFromOptions(), or KSPSetFromOptions() reset all the parameters I previously set, or why do they not seem to work?

If `XXSetFromOptions()` is used (with `-xxx_type aaaa`) to change the type of the
object then all parameters associated with the previous type are removed. Otherwise it
does not reset parameters.

`TS/SNES/KSPSetXXX()` commands that set properties for a particular type of object (such
as `KSPGMRESSetRestart()`) ONLY work if the object is ALREADY of that type. For example,
with

```
KSP ksp;

PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
PetscCall(KSPGMRESSetRestart(ksp, 10));
```

the restart will be ignored since the type has not yet been set to `KSPGMRES`. To have
those values take effect you should do one of the following:

- Allow setting the type from the command line; if it is not given on the command line
  then the default type is automatically set.

```
/* Create generic object */
XXXCreate(..,&obj);
/* Must set all settings here, or use the defaults */
XXXSetFromOptions(obj);
```

- Hardwire the type in the code, but allow the user to override it via a subsequent
  `XXXSetFromOptions()` call. This essentially allows the user to customize what the
  "default" type of the object is.
```
/* Create generic object */
XXXCreate(..,&obj);
/* Set the type directly */
XXXSetYYYYY(obj,...);
/* Can always change to a different type */
XXXSetFromOptions(obj);
```

### How do I compile and link my own PETSc application codes and can I use my own `makefile` or rules for compiling code, rather than PETSc's?

See the {ref}`section <sec_writing_application_codes>` of the users manual on writing
application codes with PETSc.

### Can I use CMake to build my own project that depends on PETSc?

See the {ref}`section <sec_writing_application_codes>` of the users manual on writing
application codes with PETSc.

### How can I put carriage returns in `PetscPrintf()` statements from Fortran?

You can use the same notation as in C: just put a `\n` in the string. Note that no other C
format instruction is supported.

Alternatively, you can use the Fortran concatenation operator `//` and `char(10)`; for
example, `'some string'//char(10)//'another string'` puts the second string on the next
line.

### How can I implement callbacks using C++ class methods?

Declare the class method static. Static methods do not have a `this` pointer, but the
`void*` context parameter will usually be cast to a pointer to the class, where it can
serve the same function.

:::{admonition} Remember
All PETSc callbacks return `PetscErrorCode`.
:::

### Everyone knows that when you code Newton's Method you should compute the function and its Jacobian at the same time. How can one do this in PETSc?

The update in Newton's method is computed as

$$
u^{n+1} = u^n - \lambda \left[J(u^n)\right]^{-1} F(u^n)
$$

The reasons PETSc doesn't default to computing both the function and Jacobian at the same
time are:

- In order to do the line search, $F \left(u^n - \lambda \, \text{step} \right)$ may
  need to be computed for several $\lambda$.
  The Jacobian is not needed for each of
  those, and one does not know in advance which will be the final $\lambda$ until
  after the function value is computed, so many extra Jacobians may be computed.
- In the final step, if $\| F(u^p)\|$ satisfies the convergence criteria then a
  Jacobian need not be computed.

You are free to have your `FormFunction()` compute as much of the Jacobian at that point
as you like, keep the information in the user context (the final argument to
`FormFunction()` and `FormJacobian()`), and then retrieve the information in your
`FormJacobian()` function.

### Computing the Jacobian or preconditioner is time consuming. Is there any way to compute it less often?

PETSc has a variety of ways of lagging the computation of the Jacobian or the
preconditioner. They are documented in the manual page for `SNESComputeJacobian()`
and in the {ref}`users manual <ch_snes>`:

- `-snes_lag_jacobian` (`SNESSetLagJacobian()`): how often the Jacobian is rebuilt
  (use -1 to never rebuild; use -2 to rebuild the next time it is requested and then
  never again).

- `-snes_lag_jacobian_persists` (`SNESSetLagJacobianPersists()`): forces lagging of
  the Jacobian through multiple `SNES` solves; same as passing -2 to
  `-snes_lag_jacobian`. By default, each new `SNES` solve triggers a recomputation
  of the Jacobian.

- `-snes_lag_preconditioner` (`SNESSetLagPreconditioner()`): how often the
  preconditioner is rebuilt. Note: if you are lagging the Jacobian, the system will
  know that the matrix has not changed and will not recompute the (same)
  preconditioner.
- `-snes_lag_preconditioner_persists` (`SNESSetLagPreconditionerPersists()`): the
  preconditioner lags through multiple `SNES` solves.

:::{note}
These are often (but need not be) used in combination with
`-snes_mf_operator`, which applies the fresh Jacobian matrix-free for every
matrix-vector product. Otherwise the out-of-date matrix-vector product, computed with
the lagged Jacobian, will be used.
:::

By using `KSPMonitorSet()` and/or `SNESMonitorSet()` one can provide code that monitors the
convergence rate and automatically triggers an update of the Jacobian or preconditioner
based on the deteriorating convergence of the iterative method. For example, if the number of `SNES`
iterations doubles one might trigger a new computation of the Jacobian. Experimentation is
the only general purpose way to determine which approach is best for your problem.

:::{important}
It is also vital to experiment on your true problem at the scale you will be solving
the problem, since the performance benefits depend on the exact problem and the problem
size!
:::

### How can I use Newton's Method Jacobian free? Can I difference a different function than provided with `SNESSetFunction()`?

The simplest way is with the option `-snes_mf`; this will use finite differencing of the
function provided with `SNESSetFunction()` to approximate the action of the Jacobian.

:::{important}
Since no matrix representation of the Jacobian is provided, the `-pc_type` used with
this option must be `-pc_type none`. You may provide a custom preconditioner with
`SNESGetKSP()`, `KSPGetPC()`, and `PCSetType()` and use `PCSHELL`.
:::

The option `-snes_mf_operator` will apply the Jacobian matrix-free (in the
Krylov solvers) but will use whatever matrix you provided with `SNESSetJacobian()`
(assuming you set one) to compute the preconditioner.
To write the code (rather than use the options above), use `MatCreateSNESMF()` and pass
the resulting matrix object to `SNESSetJacobian()`.

For purely matrix-free (like `-snes_mf`), pass the matrix object for both matrix
arguments and pass the function `MatMFFDComputeJacobian()`.

To provide your own approximate Jacobian matrix to compute the preconditioner (like
`-snes_mf_operator`), pass this other matrix as the second matrix argument to
`SNESSetJacobian()`. Make sure your provided `computejacobian()` function calls
`MatAssemblyBegin()` and `MatAssemblyEnd()` separately on **BOTH** matrix arguments
to this function. See `src/snes/tests/ex7.c`.

To difference a different function than that passed to `SNESSetFunction()` to compute the
matrix-free Jacobian multiply, call `MatMFFDSetFunction()` to set that other function. See
`src/snes/tests/ex7.c`.

(doc_faq_usage_condnum)=

### How can I determine the condition number of a matrix?

For small matrices, the condition number can be reliably computed using

```text
-pc_type svd -pc_svd_monitor
```

For larger matrices, you can run with

```text
-pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000
```

to get approximations to the condition number of the operator. This will generally be
accurate for the largest singular values, but may overestimate the smallest singular value
unless the method has converged. Make sure to avoid restarts. To estimate the condition
number of the preconditioned operator, replace `-pc_type none` with `-pc_type somepc` in
the last command.

You can use [SLEPc](https://slepc.upv.es) for highly scalable, efficient, and high-quality
eigenvalue computations.

### How can I compute the inverse of a matrix in PETSc?

:::{admonition} Are you sure?
:class: yellow

It is very expensive to compute the inverse of a matrix and it is very rarely needed in
practice.
We highly recommend avoiding algorithms that need it.
:::

The inverse of a matrix (dense or sparse) is essentially always dense, so begin by
creating a dense matrix B filled with the identity matrix (ones along the diagonal);
also create a dense matrix X of the same size that will hold the solution. Then factor the
matrix you wish to invert with `MatLUFactor()` or `MatCholeskyFactor()`; call the
result A. Then call `MatMatSolve(A,B,X)` to compute the inverse into X. See also the section
on {any}`the Schur complement <how_can_i_compute_the_schur_complement>`.

(how_can_i_compute_the_schur_complement)=

### How can I compute the Schur complement in PETSc?

:::{admonition} Are you sure?
:class: yellow

It is very expensive to compute the Schur complement of a matrix and it is very rarely needed
in practice. We highly recommend avoiding algorithms that need it.
:::

The Schur complement of the matrix $M \in \mathbb{R}^{\left(p+q \right) \times
\left(p + q \right)}$

$$
M = \begin{bmatrix}
A & B \\
C & D
\end{bmatrix}
$$

where

$$
A \in \mathbb{R}^{p \times p}, \quad B \in \mathbb{R}^{p \times q}, \quad C \in \mathbb{R}^{q \times p}, \quad D \in \mathbb{R}^{q \times q}
$$

is given by

$$
S_D := A - BD^{-1}C \\
S_A := D - CA^{-1}B
$$

Like the inverse, the Schur complement of a matrix (dense or sparse) is essentially always
dense, so assuming you wish to calculate $S_A = D - C \underbrace{
\overbrace{(A^{-1})}^{U} B}_{V}$, begin by:

1. Forming a dense matrix $B$.
2. Creating another dense matrix $V$ of the same size.
3. Factoring the matrix $A$ directly with `MatLUFactor()` or
   `MatCholeskyFactor()`, or using `MatGetFactor()` followed by
   `MatLUFactorSymbolic()` and `MatLUFactorNumeric()` if you wish to use an
   external solver package like SuperLU_Dist.
   Call the result $U$.
4. Call `MatMatSolve(U,B,V)`.
5. Call `MatMatMult(C,V,MAT_INITIAL_MATRIX,1.0,&S)`.
6. Call `MatAXPY(S,-1.0,D,SUBSET_NONZERO_PATTERN)`.
7. Finally, call `MatScale(S,-1.0)`.

For computing Schur complements like this it does not make sense to use the `KSP`
iterative solvers, since for solving the many moderate-size systems involved a direct
factorization is much faster than iterative solvers. As you can see, this requires a great
deal of work space and computation, so it is best avoided.

However, it is not necessary to assemble the Schur complement $S$ in order to solve
systems with it. Use `MatCreateSchurComplement(A,A_pre,B,C,D,&S)` to create a
matrix that applies the action of $S$ (using `A_pre` to solve with `A`), but
does not assemble it.

Alternatively, if you already have a block matrix `M = [A, B; C, D]` (in some
ordering), then you can create index sets (`IS`) `isa` and `isb` to address each
block, then use `MatGetSchurComplement()` to create the Schur complement and/or an
approximation suitable for preconditioning.

Since $S$ is generally dense, standard preconditioning methods cannot typically be
applied directly to Schur complements. There are many approaches to preconditioning Schur
complements, including using the `SIMPLE` approximation

$$
D - C \text{diag}(A)^{-1} B
$$

to create a sparse matrix that approximates the Schur complement (this is returned by
default for the optional "preconditioning" matrix in `MatGetSchurComplement()`).

An alternative is to interpret the matrices as differential operators and apply
approximate commutator arguments to find a spectrally equivalent operation that can be
applied efficiently (see the "PCD" preconditioners {cite}`elman_silvester_wathen_2014`).
A
variant of this is the least squares commutator, which is closely related to the
Moore-Penrose pseudoinverse, and is available in `PCLSC`, which operates on matrices of
type `MATSCHURCOMPLEMENT`.

### Do you have examples of doing unstructured grid Finite Element Method (FEM) with PETSc?

There are at least three ways to write finite element codes using PETSc:

1. Use `DMPLEX`, which is a high level approach to managing your mesh and
   discretization. See the {ref}`tutorials sections <tut_stokes>` for further information,
   or see `src/snes/tutorials/ex62.c`.
2. Use packages such as [deal.ii](https://www.dealii.org), [libMesh](https://libmesh.github.io), or
   [Firedrake](https://www.firedrakeproject.org), which use PETSc for the solvers.
3. Manage the grid data structure yourself and use `PetscSF`, `IS`, and `VecScatter` to
   perform the required ghost point communication. See
   `src/snes/tutorials/ex10d/ex10.c`.

### DMDA decomposes the domain differently than the MPI_Cart_create() command. How can one use them together?

`MPI_Cart_create()` first divides the mesh along the z direction, then the y, then
the x. `DMDA` divides along the x, then y, then z. Thus, for example, rank 1 of the
processes will be in a different part of the mesh for the two schemes. To resolve this you
can create a new MPI communicator that you pass to `DMDACreate()` that renumbers the
process ranks so that each physical process shares the same part of the mesh with both the
`DMDA` and the `MPI_Cart_create()`.
The code to determine the new numbering was
provided by Rolf Kuiper:

```
// the numbers of processes per direction are (int) x_procs, y_procs, z_procs respectively
// (no parallelization in direction 'dir' means dir_procs = 1)

MPI_Comm    NewComm;
int         x, y, z;
PetscMPIInt MPI_rank, NewRank;

// get the rank in the MPI ordering:
PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &MPI_rank));

// calculate the coordinates of the process in the MPI ordering:
x = MPI_rank / (z_procs*y_procs);
y = (MPI_rank % (z_procs*y_procs)) / z_procs;
z = (MPI_rank % (z_procs*y_procs)) % z_procs;

// set the new rank according to the PETSc ordering:
NewRank = z*y_procs*x_procs + y*x_procs + x;

// create a communicator with the new ranks according to the PETSc ordering
PetscCallMPI(MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm));

// override the default communicator (was MPI_COMM_WORLD by default)
PETSC_COMM_WORLD = NewComm;
```

### When solving a system with Dirichlet boundary conditions I can use MatZeroRows() to eliminate the Dirichlet rows but this results in a non-symmetric system. How can I apply Dirichlet boundary conditions but keep the matrix symmetric?

- For nonsymmetric systems, put the appropriate boundary solutions in the x vector and use
  `MatZeroRows()` followed by `KSPSetOperators()`.
- For symmetric problems, use `MatZeroRowsColumns()`.
- If you have many Dirichlet locations you can use `MatZeroRows()` (**not**
  `MatZeroRowsColumns()`) and `-ksp_type preonly -pc_type redistribute` (see
  `PCREDISTRIBUTE`); PETSc will repartition the parallel matrix for load
  balancing. In this case the new matrix solved remains symmetric even though
  `MatZeroRows()` is used.
An alternative approach is, when assembling the matrix (generating values and passing
them to the matrix), never to include locations for the Dirichlet grid points in the
vector and matrix, instead taking them into account as you put the other values into the
load.

### How can I get PETSc vectors and matrices to MATLAB or vice versa?

There are numerous ways to work with PETSc and MATLAB. All but the first approach
require PETSc to be configured with `--with-matlab`.

1. To save PETSc `Mat` and `Vec` to files that can be read from MATLAB use the
   `PetscViewerBinaryOpen()` viewer and `VecView()` or `MatView()` to save objects
   for MATLAB, and `VecLoad()` and `MatLoad()` to get the objects that MATLAB has
   saved. See `share/petsc/matlab/PetscBinaryRead.m` and
   `share/petsc/matlab/PetscBinaryWrite.m` for loading and saving the objects in MATLAB.
2. Using the [MATLAB Engine](https://www.mathworks.com/help/matlab/calling-matlab-engine-from-c-programs-1.html)
   allows PETSc to automatically call MATLAB to perform some specific computations. This
   does not allow MATLAB to be used interactively by the user. See
   `PetscMatlabEngine`.
3. You can open a socket connection between MATLAB and PETSc to allow sending objects back
   and forth between an interactive MATLAB session and a running PETSc program. See
   `PetscViewerSocketOpen()` for access from the PETSc side and
   `share/petsc/matlab/PetscReadBinary.m` for access from the MATLAB side.
4. You can save a PETSc `Vec` (**not** `Mat`) with the `PetscViewerMatlabOpen()`
   viewer, which writes `.mat` files that can then be loaded into MATLAB using the
   `load()` command.

### How do I get started with Cython so that I can extend petsc4py?

1. Learn how to [build a Cython module](http://docs.cython.org/src/quickstart/build.html).
2. Go through the simple [example](https://stackoverflow.com/questions/3046305/simple-wrapping-of-c-code-with-cython). Note
   also the next comment, which shows how to create numpy arrays in Cython and pass them
   back.
3. Check out [this page](http://docs.cython.org/src/tutorial/numpy.html), which tells
   you how to get fast indexing.
4. Have a look at the petsc4py [array source](http://code.google.com/p/petsc4py/source/browse/src/PETSc/arraynpy.pxi).

### How do I compute a custom norm for KSP to use as a convergence test or for monitoring?

You need to call `KSPBuildResidual()` on your `KSP` object and then compute the
appropriate norm of the resulting residual. Note that, depending on the
`KSPSetNormType()` of the method, you may not return the same norm as provided by the
method. See also `KSPSetPCSide()`.

### If I have a sequential program can I use a PETSc parallel solver?

:::{important}
Do not expect to get great speedups! Much of the speedup gained by using parallel
solvers comes from building the underlying matrices and vectors in parallel to begin
with. You should see some reduction in the time for the linear solvers.
:::

Yes, you must set up PETSc with MPI (even though you will not use MPI) with at least the
following options:

```console
$ ./configure --download-superlu_dist --download-parmetis --download-metis --with-openmp
```

Your compiler must support OpenMP. To have the linear solver run in parallel, run your
program with

```console
$ OMP_NUM_THREADS=n ./myprog -pc_type lu -pc_factor_mat_solver_type superlu_dist
```

where `n` is the number of threads; it should be less than or equal to the number of cores
available.

:::{note}
If your code is MPI parallel you can also use these same options to have SuperLU_dist
utilize multiple threads per MPI process for the direct solver.
Make sure that the
`$OMP_NUM_THREADS` you use per MPI process is less than or equal to the number of
cores available for each MPI process. For example, if your compute nodes have 6 cores
and you use 2 MPI processes per node, then set `$OMP_NUM_THREADS` to 2 or 3.
:::

Another approach that allows using a PETSc parallel solver from a sequential program is `PCMPI`.

### TS or SNES produces infeasible (out of domain) solutions or states. How can I prevent this?

For `TS`, call `TSSetFunctionDomainError()`. For both `TS` and `SNES`, call
`SNESSetFunctionDomainError()` when the solver passes an infeasible (out of domain)
solution or state to your routines.

If this occurs for DAEs, it is important to ensure the algebraic constraints are well
satisfied, which can prevent "breakdown" later. Thus, one can try using a tight tolerance
for `SNES`, using a direct linear solver (`PCType` of `PCLU`) when possible, and reducing the timestep (or
tightening `TS` tolerances for adaptive time stepping).

### Can PETSc work with Hermitian matrices?

PETSc's support for Hermitian matrices is limited. Many operations and solvers work
for symmetric (`MATSBAIJ`) matrices and operations on transpose matrices, but there is
little direct support for Hermitian matrices and Hermitian transpose (complex conjugate
transpose) operations. There is `KSPSolveTranspose()` for solving the transpose of a
linear system but no `KSPSolveHermitian()`.

For creating known Hermitian matrices:

- `MatCreateNormalHermitian()`
- `MatCreateHermitianTranspose()`

For determining or setting Hermitian status on existing matrices:

- `MatIsHermitian()`
- `MatIsHermitianKnown()`
- `MatIsStructurallySymmetric()`
- `MatIsSymmetricKnown()`
- `MatIsSymmetric()`
- `MatSetOption()` (use with `MAT_SYMMETRIC` or `MAT_HERMITIAN` to assert to PETSc
  that either is the case).
For performing matrix operations on known Hermitian matrices (note that regular `Mat`
functions such as `MatMult()` will of course also work on Hermitian matrices):

- `MatMultHermitianTranspose()`
- `MatMultHermitianTransposeAdd()` (very limited support)

### How can I assemble a bunch of similar matrices?

You can first add the values common to all the matrices, then use `MatStoreValues()` to
stash the common values. In each iteration, call `MatRetrieveValues()`, then set the
unique values in the matrix and assemble.

### Can one resize or change the size of PETSc matrices or vectors?

No. Once the vector or matrix sizes have been set and the matrices or vectors are fully
usable, one cannot change the size of the matrices or vectors or the number of processes they
live on. One may create new vectors and copy over the old values from the previous
vector, for example using `VecScatterCreate()`.

### How can one compute the nullspace of a sparse matrix with MUMPS?
Assuming you have an existing matrix $A$ whose nullspace $V$ you want to find:

```
Mat      F, work, V;
PetscInt N, rows;

/* Determine factorability */
PetscCall(MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F));
PetscCall(MatGetLocalSize(A, &rows, NULL));

/* Set MUMPS options, see the MUMPS documentation for more information */
PetscCall(MatMumpsSetIcntl(F, 24, 1));
PetscCall(MatMumpsSetIcntl(F, 25, 1));

/* Perform the factorization */
PetscCall(MatLUFactorSymbolic(F, A, NULL, NULL, NULL));
PetscCall(MatLUFactorNumeric(F, A, NULL));

/* This is the dimension of the null space */
PetscCall(MatMumpsGetInfog(F, 28, &N));

/* This will contain the null space in its columns */
PetscCall(MatCreateDense(comm, rows, N, PETSC_DETERMINE, PETSC_DETERMINE, NULL, &V));

PetscCall(MatDuplicate(V, MAT_DO_NOT_COPY_VALUES, &work));
PetscCall(MatMatSolve(F, work, V));
```

______________________________________________________________________

## Execution

### PETSc executables are SO big and take SO long to link

:::{note}
See the {ref}`shared libraries section <doc_faq_sharedlibs>` for more information.
:::

We find this annoying as well. On most machines PETSc can use shared libraries, so
executables should be much smaller; run `configure` with the additional option
`--with-shared-libraries` (this is the default). Also, if you have room, compiling and
linking PETSc on your machine's `/tmp` disk or a similar local disk, rather than over the
network, will be much faster.

### How does PETSc's `-help` option work? Why is it different for different programs?

There are two ways in which one interacts with the options database:

- `PetscOptionsGetXXX()` where `XXX` is some type or data structure (for example
  `PetscOptionsGetBool()` or `PetscOptionsGetScalarArray()`).
  This is a classic
  "getter" function, which queries the command line options for a matching option name
  and returns the specified value.
- `PetscOptionsXXX()` where `XXX` is some type or data structure (for example
  `PetscOptionsBool()` or `PetscOptionsScalarArray()`). This is a so-called "provider"
  function. It first records the option name in an internal list of previously encountered
  options before calling `PetscOptionsGetXXX()` to query the status of said option.

While users generally use the first variant, developers will *always* use the second
(provider) variant. Thus, as the program runs, it builds up a list of
encountered option names which are then printed **in the order of their appearance on the
root rank**. Different programs may take different paths through the PETSc source code, so
they will encounter different providers, and therefore have different `-help` output.

### PETSc has so many options for my program that it is hard to keep them straight

Running the PETSc program with the option `-help` will print out many of the options. To
print the options that have been specified within a program, employ `-options_left`; it
prints any options that the user specified but were not actually used by the program, and
all options that were used. This is helpful for detecting typos. The PETSc website has a search option,
in the upper right hand corner, that quickly finds answers to most PETSc questions.

### PETSc automatically handles many of the details in parallel PDE solvers. How can I understand what is really happening within my program?

You can use the option `-info` to get more details about the solution process. The
option `-log_view` provides details about the distribution of time spent in the various
phases of the solution process.
You can run with `-ts_view`, `-snes_view`, or
`-ksp_view` to see what solver options are being used. Run with `-ts_monitor`,
`-snes_monitor`, or `-ksp_monitor` to watch the convergence of the
methods. `-snes_converged_reason` and `-ksp_converged_reason` will indicate why, and
whether, the solvers have converged.

### Assembling large sparse matrices takes a long time. What can I do to make this process faster? Or MatSetValues() is so slow; what can I do to speed it up?

You probably need to do preallocation, as explained in {any}`sec_matsparse`.
See also the {ref}`performance chapter <ch_performance>` of the users manual.

For GPUs (and even CPUs) you can use `MatSetPreallocationCOO()` and `MatSetValuesCOO()` for more rapid assembly.

### How can I generate performance summaries with PETSc?

Use this option at runtime:

- `-log_view`: outputs a comprehensive timing, memory consumption, and communications digest
  for your program. See the {ref}`profiling chapter <ch_profiling>` of the users
  manual for information on interpreting the summary data.

### How do I know the amount of time spent on each level of the multigrid solver/preconditioner?

Run with `-log_view` and `-pc_mg_log`.

### Where do I get the input matrices for the examples?

Some examples use `$DATAFILESPATH/matrices/medium` and other files. These test matrices
in PETSc binary format can be found in the [datafiles repository](https://gitlab.com/petsc/datafiles).

### When I dump some matrices and vectors to binary, I seem to be generating some empty files with `.info` extensions. What's the deal with these?

PETSc binary viewers put some additional information into `.info` files, like the matrix
block size. It is harmless, but if you *really* don't want it you can use
`-viewer_binary_skip_info` or `PetscViewerBinarySkipInfo()`.
:::{note}
You need to call `PetscViewerBinarySkipInfo()` before
`PetscViewerFileSetName()`. In other words, you **cannot** use
`PetscViewerBinaryOpen()` directly.
:::

### Why is my parallel solver slower than my sequential solver, or why do I have poor speed-up?

This can happen for many reasons:

1. Make sure it is truly the time in `KSPSolve()` that is slower (by running the code
   with `-log_view`). Often the slower time is in generating the matrix or some other
   operation.
2. There must be enough work for each process to outweigh the communication time. We
   recommend an absolute minimum of about 10,000 unknowns per process; better is 20,000 or
   more. This is even more true when using multiple GPUs, where you need to have millions
   of unknowns per GPU.
3. Make sure the {ref}`communication speed of the parallel computer
   <doc_faq_general_parallel>` is good enough for parallel solvers.
4. Check the number of solver iterations with the parallel solver against the sequential
   solver. Most preconditioners require more iterations when used on more processes; this
   is particularly true for block Jacobi (the default parallel preconditioner). You can
   try `-pc_type asm` (`PCASM`), whose iterations scale a bit better with more
   processes. You may also consider multigrid preconditioners like `PCMG` or BoomerAMG
   in `PCHYPRE`.

(doc_faq_pipelined)=

### What steps are necessary to make the pipelined solvers execute efficiently?

Pipelined solvers like `KSPPGMRES`, `KSPPIPECG`, `KSPPIPECR`, and `KSPGROPPCG` may
require special MPI configuration to effectively overlap reductions with computation. In
general, this requires an MPI-3 implementation, an implementation that supports multiple
threads, and use of a "progress thread". Consult the documentation from your vendor or
computing facility for more details.
- Cray MPT 5.6 implements MPI-3. Enable thread support by setting
  `$MPICH_MAX_THREAD_SAFETY` to "multiple", plus either `$MPICH_ASYNC_PROGRESS` or
  `$MPICH_NEMESIS_ASYNC_PROGRESS`. E.g.

  ```console
  $ export MPICH_MAX_THREAD_SAFETY=multiple
  $ export MPICH_ASYNC_PROGRESS=1
  $ export MPICH_NEMESIS_ASYNC_PROGRESS=1
  ```

- MPICH version 3.0 and later implements the MPI-3 standard and the default
  configuration supports the use of threads. Use of a progress thread is configured by
  setting the environment variable `$MPICH_ASYNC_PROGRESS`. E.g.

  ```console
  $ export MPICH_ASYNC_PROGRESS=1
  ```

### When using PETSc in single precision mode (`--with-precision=single` when running `configure`) are the operations done in single or double precision?

PETSc does **NOT** do any explicit conversion of single precision to double before
performing computations; what happens depends on the hardware and the compiler. For
example, the compiler could choose to put the single precision numbers into the usual
double precision registers and then use the usual double precision floating point unit, or
it could use SSE2 instructions that work directly on the single precision numbers. It is a
bit of a mystery what decisions get made sometimes. There may be compiler flags in some
circumstances that can affect this.

### Why is Newton's method (SNES) not converging, or converges slowly?

Newton's method may not converge for many reasons; here are some of the most common:

- The Jacobian is wrong (or correct in sequential but not in parallel).
- The linear system is {ref}`not solved <doc_faq_execution_kspconv>` or is not solved
  accurately enough.
- The Jacobian system has a singularity that the linear solver is not handling.
- There is a bug in the function evaluation routine.
- The function is not continuous or does not have continuous first derivatives (e.g.
phase 1454 change or TVD limiters). 1455- The equations may not have a solution (e.g. limit cycle instead of a steady state) or 1456 there may be a "hill" between the initial guess and the steady state (e.g. reactants 1457 must ignite and burn before reaching a steady state, but the steady-state residual will 1458 be larger during combustion). 1459 1460Here are some of the ways to help debug lack of convergence of Newton: 1461 1462- Run on one processor to see if the problem is only in parallel. 1463 1464- Run with `-info` to get more detailed information on the solution process. 1465 1466- Run with the options 1467 1468 ```text 1469 -snes_monitor -ksp_monitor_true_residual -snes_converged_reason -ksp_converged_reason 1470 ``` 1471 1472 - If the linear solve does not converge, check if the Jacobian is correct, then see 1473 {ref}`this question <doc_faq_execution_kspconv>`. 1474 - If the preconditioned residual converges, but the true residual does not, the 1475 preconditioner may be singular. 1476 - If the linear solve converges well, but the line search fails, the Jacobian may be 1477 incorrect. 1478 1479- Run with `-pc_type lu` or `-pc_type svd` to see if the problem is a poor linear 1480 solver. 1481 1482- Run with `-mat_view` or `-mat_view draw` to see if the Jacobian looks reasonable. 1483 1484- Run with `-snes_test_jacobian -snes_test_jacobian_view` to see if the Jacobian you are 1485 using is wrong. Compare the output when you add `-mat_fd_type ds` to see if the result 1486 is sensitive to the choice of differencing parameter. 1487 1488- Run with `-snes_mf_operator -pc_type lu` to see if the Jacobian you are using is 1489 wrong. If the problem is too large for a direct solve, try 1490 1491 ```text 1492 -snes_mf_operator -pc_type ksp -ksp_ksp_rtol 1e-12. 1493 ``` 1494 1495 Compare the output when you add `-mat_mffd_type ds` to see if the result is sensitive 1496 to choice of differencing parameter. 
1497 1498- Run with `-snes_linesearch_monitor` to see if the line search is failing (this is 1499 usually a sign of a bad Jacobian). Use `-info` in PETSc 3.1 and older versions, 1500 `-snes_ls_monitor` in PETSc 3.2 and `-snes_linesearch_monitor` in PETSc 3.3 and 1501 later. 1502 1503Here are some ways to help the Newton process if everything above checks out: 1504 1505- Run with grid sequencing (`-snes_grid_sequence` if working with a `DM` is all you 1506 need) to generate better initial guess on your finer mesh. 1507 1508- Run with quad precision, i.e. 1509 1510 ```console 1511 $ ./configure --with-precision=__float128 --download-f2cblaslapack 1512 ``` 1513 1514 :::{note} 1515 quad precision requires PETSc 3.2 and later and recent versions of the GNU compilers. 1516 ::: 1517 1518- Change the units (nondimensionalization), boundary condition scaling, or formulation so 1519 that the Jacobian is better conditioned. See [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and 1520 Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127). 1521 1522- Mollify features in the function that do not have continuous first derivatives (often 1523 occurs when there are "if" statements in the residual evaluation, e.g. phase change or 1524 TVD limiters). Use a variational inequality solver (`SNESVINEWTONRSLS`) if the 1525 discontinuities are of fundamental importance. 1526 1527- Try a trust region method (`-ts_type tr`, may have to adjust parameters). 1528 1529- Run with some continuation parameter from a point where you know the solution, see 1530 `TSPSEUDO` for steady-states. 1531 1532- There are homotopy solver packages like PHCpack that can get you all possible solutions 1533 (and tell you that it has found them all) but those are not scalable and cannot solve 1534 anything but small problems. 1535 1536(doc_faq_execution_kspconv)= 1537 1538### Why is the linear solver (KSP) not converging, or converges slowly? 
:::{tip}
Always run with `-ksp_converged_reason -ksp_monitor_true_residual` when trying to
learn why a method is not converging!
:::

Common reasons for KSP not converging are:

- A symmetric method is being used for a non-symmetric problem.

- The equations are singular by accident (e.g. forgot to impose boundary
  conditions). Check this for a small problem using `-pc_type svd -pc_svd_monitor`.

- The equations are intentionally singular (e.g. constant null space), but the Krylov
  method was not informed; see `MatSetNullSpace()`.

- The equations are intentionally singular and `MatSetNullSpace()` was used, but the
  right-hand side is not consistent. You may have to call `MatNullSpaceRemove()` on the
  right-hand side before calling `KSPSolve()`. See `MatSetTransposeNullSpace()`.

- The equations are indefinite so that standard preconditioners don't work. Usually you
  will know this from the physics, but you can check with

  ```text
  -ksp_compute_eigenvalues -ksp_gmres_restart 1000 -pc_type none
  ```

  For simple saddle point problems, try

  ```text
  -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_detect_saddle_point
  ```

  For more difficult problems, read the literature to find robust methods and ask
  <mailto:petsc-users@mcs.anl.gov> or <mailto:petsc-maint@mcs.anl.gov> if you want advice about how to
  implement them.

- If the method converges in the preconditioned residual, but not in the true residual,
  the preconditioner is likely singular or nearly so. This is common for saddle point
  problems (e.g. incompressible flow) or strongly nonsymmetric operators (e.g. low-Mach
  hyperbolic problems with large time steps).

- The preconditioner is too weak or is unstable. See if `-pc_type asm -sub_pc_type lu`
  improves the convergence rate. If GMRES is losing too much progress in the restart, see
  if a longer restart helps (`-ksp_gmres_restart 300`). If a transpose is available, try
  `-ksp_type bcgs` or other methods that do not require a restart.

  :::{note}
  Unfortunately convergence with these methods is frequently erratic.
  :::

- The preconditioner is nonlinear (e.g. a nested iterative solve); try `-ksp_type
  fgmres` or `-ksp_type gcr`.

- You are using geometric multigrid, but some equations (often boundary conditions) are
  not scaled compatibly between levels. Try `-pc_mg_galerkin` to algebraically
  construct a correctly scaled coarse operator, or make sure that all the equations are
  scaled in the same way if you want to use rediscretized coarse levels.

- The matrix is very ill-conditioned. Check the {ref}`condition number <doc_faq_usage_condnum>`.

  - Try to improve it by choosing the relative scaling of components/boundary conditions.
  - Try `-ksp_diagonal_scale -ksp_diagonal_scale_fix`.
  - Perhaps change the formulation of the problem to produce more friendly algebraic
    equations.

- Change the units (nondimensionalization), boundary condition scaling, or formulation so
  that the Jacobian is better conditioned. See the [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and
  Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127).

- Classical Gram-Schmidt is becoming unstable; try `-ksp_gmres_modifiedgramschmidt` or
  use a method that orthogonalizes differently, e.g. `-ksp_type gcr`.
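The classical vs. modified Gram-Schmidt instability mentioned above can be reproduced outside of any Krylov solver. The following pure-Python sketch (an illustration, not PETSc code) orthogonalizes the columns of a Läuchli-type matrix both ways; classical Gram-Schmidt computes all projection coefficients from the original vector and loses orthogonality badly, while the modified variant recomputes each coefficient from the partially reduced vector. This is the same effect that `-ksp_gmres_modifiedgramschmidt` addresses inside GMRES.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def cgs(cols):
    """Classical Gram-Schmidt: all coefficients use the ORIGINAL vector."""
    qs = []
    for a in cols:
        coeffs = [dot(q, a) for q in qs]   # computed before any subtraction
        v = list(a)
        for c, q in zip(coeffs, qs):
            v = [vi - c * qi for vi, qi in zip(v, q)]
        qs.append([vi / norm(v) for vi in v])
    return qs

def mgs(cols):
    """Modified Gram-Schmidt: each coefficient uses the reduced vector."""
    qs = []
    for a in cols:
        v = list(a)
        for q in qs:
            c = dot(q, v)                  # recomputed after each subtraction
            v = [vi - c * qi for vi, qi in zip(v, q)]
        qs.append([vi / norm(v) for vi in v])
    return qs

# Laeuchli matrix: nearly linearly dependent columns where eps**2 is lost to
# roundoff relative to 1 -- exactly the regime where CGS breaks down.
eps = 1e-8
cols = [[1.0, eps, 0.0, 0.0],
        [1.0, 0.0, eps, 0.0],
        [1.0, 0.0, 0.0, eps]]

q_cgs = cgs(cols)
q_mgs = mgs(cols)
print("CGS |q2.q3| =", abs(dot(q_cgs[1], q_cgs[2])))  # ~0.5: orthogonality lost
print("MGS |q2.q3| =", abs(dot(q_mgs[1], q_mgs[2])))  # tiny: orthogonality kept
```

In exact arithmetic both variants produce the same orthonormal basis; the difference is purely a floating point effect, which is why it only shows up for ill-conditioned bases.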
### I get the error message: Actual argument at (1) to assumed-type dummy is of derived type with type-bound or FINAL procedures

Use the following code-snippet:

```fortran
module context_module
#include <petsc/finclude/petsc.h>
use petsc
implicit none
private
type, public :: context_type
   private
   PetscInt :: foo
contains
   procedure, public :: init => context_init
end type context_type
contains
subroutine context_init(self, foo)
   class(context_type), intent(in out) :: self
   PetscInt, intent(in) :: foo
   self%foo = foo
end subroutine context_init
end module context_module

!------------------------------------------------------------------------

program test_snes
use, intrinsic :: iso_c_binding
use petsc
use context_module
implicit none

SNES :: snes
type(context_type), target :: context
type(c_ptr) :: contextOut
PetscErrorCode :: ierr

call PetscInitialize(PETSC_NULL_CHARACTER, ierr)
call SNESCreate(PETSC_COMM_WORLD, snes, ierr)
call context%init(1)

contextOut = c_loc(context) ! contextOut is a C pointer to context

call SNESSetConvergenceTest(snes, convergence, contextOut, PETSC_NULL_FUNCTION, ierr)

call SNESDestroy(snes, ierr)
call PetscFinalize(ierr)

contains

subroutine convergence(snes, num_iterations, xnorm, pnorm, fnorm, reason, contextIn, ierr)
SNES, intent(in) :: snes
PetscInt, intent(in) :: num_iterations
PetscReal, intent(in) :: xnorm, pnorm, fnorm
SNESConvergedReason, intent(out) :: reason
type(c_ptr), intent(in out) :: contextIn
type(context_type), pointer :: context
PetscErrorCode, intent(out) :: ierr

call c_f_pointer(contextIn, context) ! convert the C pointer to a Fortran pointer to use context as in the main program
reason = 0
ierr = 0
end subroutine convergence
end program test_snes
```

### In C++ I get a crash on `VecDestroy()` (or some other PETSc object) at the end of the program

This can happen when the destructor for a C++ class is automatically called at the end of
the program after `PetscFinalize()`. Use the following code-snippet:

```
int main(int argc, char **argv)
{
  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  {
    /* your variables and code */

  } /* all your destructors are called here automatically by C++, before PetscFinalize() */
  PetscCall(PetscFinalize());
  return 0;
}
```

______________________________________________________________________

## Debugging

### What does the message hwloc/linux: Ignoring PCI device with non-16bit domain mean?

This is printed by the hwloc library that is used by some MPI implementations. It can be ignored.
To prevent the message from always being printed, set the environment variable `HWLOC_HIDE_ERRORS` to 2.
For example

```
export HWLOC_HIDE_ERRORS=2
```

which can be added to your login profile file such as `~/.bashrc`.

### How do I turn off PETSc signal handling so I can use the `-C` option on `xlf`?

Immediately after calling `PetscInitialize()` call `PetscPopSignalHandler()`.

Some Fortran compilers, including the IBM xlf and xlF compilers, have a compile option
(`-C` for IBM's) that causes all array accesses in Fortran to be checked to be
in-bounds. This is a great feature but does require that the array dimensions be set
explicitly, not with a `*`.

### How do I debug if `-start_in_debugger` does not work on my machine?

The script <https://github.com/Azrael3000/tmpi> can be used to run multiple MPI
ranks in the debugger using tmux.
On newer macOS machines, one has to be in the admin group to be able to use the debugger.

On newer Ubuntu Linux machines, one has to disable `ptrace_scope` with

```console
$ echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
```

to get `-start_in_debugger` working.

If `-start_in_debugger` does not work on your OS, for a uniprocessor job, just
try the debugger directly, for example: `gdb ex1`. You can also use [TotalView](https://totalview.io/products/totalview), which is a good graphical parallel debugger.

### How do I see where my code is hanging?

You can use the `-start_in_debugger` option to start all processes in the debugger (each
will come up in its own xterm) or run in [TotalView](https://totalview.io/products/totalview). Then use `cont` (for continue) in each
xterm. Once you are sure that the program is hanging, hit control-c in each xterm and then
use `where` to print a stack trace for each process.

### How can I inspect PETSc vector and matrix values when in the debugger?

I will illustrate this with `gdb`, but it should be similar on other debuggers. You can
look at local `Vec` values directly by obtaining the array. For a `Vec` v, we can
print all local values using:

```console
(gdb) p ((Vec_Seq*) v->data)->array[0]@v->map.n
```

However, this becomes much more complicated for a matrix. Therefore, it is advisable to use the default viewer to look at the object. For a `Vec` v and a `Mat` m, this would be:

```console
(gdb) call VecView(v, 0)
(gdb) call MatView(m, 0)
```

or with a communicator other than `MPI_COMM_WORLD`:

```console
(gdb) call MatView(m, PETSC_VIEWER_STDOUT_(m->comm))
```

TotalView 8.8.0+ has a feature that allows libraries to provide their own code to
display objects in the debugger. Thus in theory each PETSc object, `Vec`, `Mat`, etc.
could have custom code to print values in the object. We have only done this for the most
elementary display of `Vec` and `Mat`. See the routine `TV_display_type()` in
`src/vec/vec/interface/vector.c` for an example of how these may be written. Contact us
if you would like to add more.

### How can I find the cause of floating point exceptions like not-a-number (NaN) or infinity?

The best way to locate floating point exceptions is to use a debugger. On supported
architectures (including Linux and glibc-based systems), just run in a debugger and pass
`-fp_trap` to the PETSc application. This will activate signaling exceptions and the
debugger will break on the line that first divides by zero or otherwise generates an
exception.

Without a debugger, running with `-fp_trap` in debug mode will only identify the
function in which the error occurred, but not the line or the type of exception. If
`-fp_trap` is not supported on your architecture, consult the documentation for your
debugger since there is likely a way to have it catch exceptions.

(error_libimf)=

### Error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory

The Intel compilers use shared libraries (like libimf) that cannot be found, by default, at run
time. When using the Intel compilers (and running the resulting code) you must make sure
that the proper Intel initialization scripts are run. This is usually done by adding some
code into your `.cshrc`, `.bashrc`, `.profile` etc. file. Sometimes on batch systems
that do not access your initialization files (like `.cshrc`) you must include the
initialization calls in your batch file submission.
For example, on my Mac using `csh` I have the following in my `.cshrc` file:

```csh
source /opt/intel/cc/10.1.012/bin/iccvars.csh
source /opt/intel/fc/10.1.012/bin/ifortvars.csh
source /opt/intel/idb/10.1.012/bin/idbvars.csh
```

And in my `.profile` I have

```sh
source /opt/intel/cc/10.1.012/bin/iccvars.sh
source /opt/intel/fc/10.1.012/bin/ifortvars.sh
source /opt/intel/idb/10.1.012/bin/idbvars.sh
```

(object_type_not_set)=

### What does "Object Type Not Set: Argument # N" mean?

Many operations on PETSc objects require that the specific type of the object be set
before the operation is performed. You must call `XXXSetType()` or
`XXXSetFromOptions()` before you make the offending call. For example

```
Mat A;

PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
PetscCall(MatSetValues(A, ....));
```

will not work. You must add `MatSetType()` or `MatSetFromOptions()` before the call to `MatSetValues()`. I.e.

```
Mat A;

PetscCall(MatCreate(PETSC_COMM_WORLD, &A));

PetscCall(MatSetType(A, MATAIJ));
/* Options given on the command line will override MatSetType() */
PetscCall(MatSetFromOptions(A));

PetscCall(MatSetValues(A, ....));
```

(split_ownership)=

### What does error detected in `PetscSplitOwnership()` about "sum of local lengths ..." mean?

In a previous call to `VecSetSizes()`, `MatSetSizes()`, `VecCreateXXX()` or
`MatCreateXXX()` you passed in local and global sizes that do not make sense for the
number of processors you are running on. For example, if you pass in a local size of 2 and
a global size of 100 and run on two processors, this cannot work since the sum of the
local sizes is 4, not 100.

(valgrind)=

### What does "corrupt argument" or "caught signal" or "SEGV" or "segmentation violation" or "bus error" mean? Can I use Valgrind or CUDA-Memcheck to debug memory corruption issues?

Sometimes it can mean an argument to a function is invalid. In Fortran this may be caused
by forgetting to list an argument in the call, especially the final `PetscErrorCode`.

Otherwise it is usually caused by memory corruption; that is, somewhere the code is
writing out of array bounds. To track this down, rerun the debug version of the code with
the option `-malloc_debug`. Occasionally the code may crash only with the optimized
version; in that case run the optimized version with `-malloc_debug`. If you determine
the problem is from memory corruption you can put the macro `CHKMEMQ` in the code near
the crash to determine exactly what line is causing the problem.

If `-malloc_debug` does not help: on NVIDIA CUDA systems you can use <https://docs.nvidia.com/cuda/cuda-memcheck/index.html>

If `-malloc_debug` does not help: on GNU/Linux (not macOS) machines you can
use [valgrind](http://valgrind.org). Follow the below instructions:

1. `configure` PETSc with `--download-mpich --with-debugging` (you can use other MPI implementations but most produce spurious Valgrind messages)

2. Compile your application code with this build of PETSc.

3. Run with Valgrind.

   ```console
   $ $PETSC_DIR/lib/petsc/bin/petscmpiexec -valgrind -n NPROC PETSCPROGRAMNAME PROGRAMOPTIONS
   ```

   or

   ```console
   $ mpiexec -n NPROC valgrind --tool=memcheck -q --num-callers=20 \
     --suppressions=$PETSC_DIR/share/petsc/suppressions/valgrind \
     --log-file=valgrind.log.%p PETSCPROGRAMNAME -malloc off PROGRAMOPTIONS
   ```

:::{note}
- The option `--with-debugging` enables valgrind to give a stack trace with additional
  source-file:line-number info.
- The `--download-mpich` build is valgrind clean; other MPI builds are not valgrind clean.
- When `--download-mpich` is used, mpiexec will be in `$PETSC_ARCH/bin`.
- The `--log-file=valgrind.log.%p` option tells valgrind to store the output from each
  process in a different file (as `%p`, i.e. the PID, is different for each MPI process).
- `memcheck` will not find certain array accesses that violate static array
  declarations, so if memcheck runs clean you can try `--tool=exp-ptrcheck`
  instead.
:::

You might also consider using <http://drmemory.org>, which has support for GNU/Linux, Apple
macOS and Microsoft Windows machines. (Note we haven't tried this ourselves.)

(zeropivot)=

### What does "detected zero pivot in LU factorization" mean?

A zero pivot in LU, ILU, Cholesky, or ICC sparse factorization does not always mean that
the matrix is singular. You can use

```text
-pc_factor_shift_type nonzero -pc_factor_shift_amount [amount]
```

or

```text
-pc_factor_shift_type positive_definite -[level]_pc_factor_shift_type nonzero
 -pc_factor_shift_amount [amount]
```

or

```text
-[level]_pc_factor_shift_type positive_definite
```

to prevent the zero pivot. [level] is "sub" when lu, ilu, cholesky, or icc are employed in
each individual block of the bjacobi or ASM preconditioner. [level] is "mg_levels" or
"mg_coarse" when lu, ilu, cholesky, or icc are used inside multigrid smoothers or the
coarse grid solver. See `PCFactorSetShiftType()`, `PCFactorSetShiftAmount()`.

This error can also happen if your matrix is singular; see `MatSetNullSpace()` for how
to handle this. If this error occurs in the zeroth row of the matrix, it is likely you
have an error in the code that generates the matrix.
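To see why a diagonal shift can rescue a factorization, consider a toy version of the idea behind `-pc_factor_shift_type nonzero`. The sketch below (illustrative pure Python, not PETSc's actual factorization code) performs LU without pivoting on a perfectly nonsingular matrix whose first pivot happens to be zero, then retries after adding a small shift to the diagonal; the shift amount `0.1` is an arbitrary choice for the demo.

```python
def lu_no_pivot(A):
    """Plain Doolittle LU without pivoting; raises on a zero pivot, mimicking
    the 'detected zero pivot in LU factorization' error."""
    n = len(A)
    U = [row[:] for row in A]
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        if U[k][k] == 0.0:
            raise ZeroDivisionError("zero pivot in row %d" % k)
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]
            L[i][k] = m
            for j in range(k, n):
                U[i][j] -= m * U[k][j]
    return L, U

# Nonsingular matrix whose (0,0) entry happens to be zero: LU without
# pivoting fails even though the matrix is perfectly invertible.
A = [[0.0, 1.0],
     [1.0, 1.0]]
try:
    lu_no_pivot(A)
    failed = False
except ZeroDivisionError:
    failed = True

# Shifting the diagonal (conceptually what the factorization shift options do)
# moves the pivot away from zero so the factorization can proceed.
shift = 0.1
A_shifted = [[A[i][j] + (shift if i == j else 0.0) for j in range(2)]
             for i in range(2)]
L, U = lu_no_pivot(A_shifted)
print("unshifted factorization failed:", failed)
print("shifted pivots:", U[0][0], U[1][1])
```

Of course the shifted factors are those of a perturbed matrix, which is why PETSc uses them only as a preconditioner, with the Krylov method correcting for the perturbation.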
### You create draw windows or `PETSCVIEWERDRAW` windows or use options `-ksp_monitor draw::draw_lg` or `-snes_monitor draw::draw_lg` and the program seems to run OK but windows never open

The libraries were compiled without support for X windows. Make sure that `configure`
was run with the option `--with-x`.

### The program seems to use more and more memory as it runs, even though you don't think you are allocating more memory

Some of the following may be occurring:

- You are creating new PETSc objects but never freeing them.
- There is a memory leak in PETSc or your code.
- Something much more subtle (if you are using Fortran): when you declare a large array
  in Fortran, the operating system does not allocate all the memory pages for that array
  until you start using the different locations in the array. Thus, in a code, if at each
  step you start using later values in the array your virtual memory usage will "continue"
  to increase as measured by `ps` or `top`.
- You are running with the `-log`, `-log_mpe`, or `-log_all` option. With these
  options, a great deal of logging information is stored in memory until the conclusion of
  the run.
- You are linking with the MPI profiling libraries; these cause logging of all MPI
  activities. Another symptom is that at the conclusion of the run it may print some
  message about writing log files.

The following may help:

- Run with the `-malloc_debug` option and `-malloc_view`. Or use `PetscMallocDump()`
  and `PetscMallocView()` sprinkled about your code to track memory that is allocated
  and not later freed. Use the commands `PetscMallocGetCurrentUsage()` and
  `PetscMemoryGetCurrentUsage()` to monitor memory allocated and
  `PetscMallocGetMaximumUsage()` and `PetscMemoryGetMaximumUsage()` for total memory
  used as the code progresses.
- This is just the way Unix works and is harmless.
- Do not use the `-log`, `-log_mpe`, or `-log_all` option, or use
  `PetscLogEventDeactivate()` or `PetscLogEventDeactivateClass()` to turn off logging of
  specific events.
- Make sure you do not link with the MPI profiling libraries.

### When calling `MatPartitioningApply()` you get a message "Error! Key 16615 Not Found"

The graph of the matrix you are using is not symmetric. You must use symmetric matrices
for partitioning.

### With GMRES at restart the second residual norm printed does not match the first

I.e.

```text
26 KSP Residual norm 3.421544615851e-04
27 KSP Residual norm 2.973675659493e-04
28 KSP Residual norm 2.588642948270e-04
29 KSP Residual norm 2.268190747349e-04
30 KSP Residual norm 1.977245964368e-04
30 KSP Residual norm 1.994426291979e-04 <----- At restart the residual norm is printed a second time
```

This is actually not surprising! GMRES computes the norm of the residual at each iteration
via a recurrence relation between the norms of the residuals at the previous iterations
and quantities computed at the current iteration. It does not compute it directly as
$|| b - A x^{n} ||$.

Sometimes, especially with an ill-conditioned matrix, or computation of the matrix-vector
product via differencing, the residual norms computed by GMRES start to "drift" from the
correct values. At the restart, we compute the residual norm directly, hence the
difference printed. The drift, if it remains small, is harmless (it doesn't affect the
accuracy of the solution that GMRES computes).

If you use a more powerful preconditioner the drift will often be smaller and less
noticeable. Or, if you are running matrix-free, you may need to tune the matrix-free
parameters.

### Why do some Krylov methods seem to print two residual norms per iteration?

I.e.
```text
1198 KSP Residual norm 1.366052062216e-04
1198 KSP Residual norm 1.931875025549e-04
1199 KSP Residual norm 1.366026406067e-04
1199 KSP Residual norm 1.931819426344e-04
```

Some Krylov methods, for example `KSPTFQMR`, actually have a "sub-iteration" of size 2
inside the loop. Each of the two substeps has its own matrix-vector product and
application of the preconditioner and updates the residual approximations. This is why you
get this "funny" output where it looks like there are two residual norms per
iteration. You can also think of it as twice as many iterations.

### Unable to locate PETSc dynamic library `libpetsc`

When using DYNAMIC libraries, the libraries cannot be moved after they are
installed. This can also happen on clusters, where the paths are different on the (run)
nodes than on the (compile) front-end. **Do not use dynamic libraries & shared
libraries**. Run `configure` with
`--with-shared-libraries=0 --with-dynamic-loading=0`.

:::{important}
This option has been removed in PETSc v3.5
:::

### How do I determine what update to PETSc broke my code?

If at some point (in PETSc code history) you had a working code, but the latest PETSc
code broke it, it is possible to determine the PETSc code change that might have caused
this behavior. This is achieved by:

- Using Git to access PETSc sources
- Knowing the Git commit for the known working version of PETSc
- Knowing the Git commit for the known broken version of PETSc
- Using the [bisect](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html)
  functionality of Git

Git bisect can be done as follows:

1. Get PETSc development (main branch in git) sources

   ```console
   $ git clone https://gitlab.com/petsc/petsc.git
   ```

2. Find the good and bad markers to start the bisection process. This can be done either
   by checking `git log` or `gitk` or <https://gitlab.com/petsc/petsc> or the web
   history of petsc-release clones. Let's say the known bad commit is 21af4baa815c and
   the known good commit is 5ae5ab319844.

3. Start the bisection process with these known revisions. Build PETSc, and test your code
   to confirm known good/bad behavior:

   ```console
   $ git bisect start 21af4baa815c 5ae5ab319844
   ```

   build/test, perhaps discover that this new state is bad

   ```console
   $ git bisect bad
   ```

   build/test, perhaps discover that this state is good

   ```console
   $ git bisect good
   ```

   Now keep bisecting until done, building PETSc and testing your code with it to
   determine if the code is working or not. After something like 5-15 iterations, `git
   bisect` will pinpoint the exact code change that resulted in the difference in
   application behavior.

:::{tip}
See [git-bisect(1)](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html) and the
[debugging section of the Git Book](https://git-scm.com/book/en/Git-Tools-Debugging-with-Git) for more debugging tips.
:::

### How to fix the error "PMIX Error: error in file gds_ds12_lock_pthread.c"?

This seems to be an error when using Open MPI and OpenBLAS with threads (or perhaps other
packages that use threads). Running

```console
$ export PMIX_MCA_gds=hash
```

should resolve the problem.

______________________________________________________________________

(doc_faq_sharedlibs)=

## Shared Libraries

### Can I install PETSc libraries as shared libraries?

Yes. Use

```console
$ ./configure --with-shared-libraries
```

### Why should I use shared libraries?
When you link to shared libraries, the function symbols from the shared libraries are not
copied into the executable. This way the size of the executable is considerably smaller than
when using regular libraries. This helps in a couple of ways:

- Saves disk space when more than one executable is created
- Improves link time immensely, because the linker has to write a much smaller
  file (executable) to the disk.

### How do I link to the PETSc shared libraries?

By default, the compiler should pick up the shared libraries instead of the regular
ones. Nothing special should be done for this.

### What if I want to link to the regular `.a` library files?

You must run `configure` with the option `--with-shared-libraries=0` (you can use a
different `$PETSC_ARCH` for this build so you can easily switch between the two).

### What do I do if I want to move my executable to a different machine?

You would also need to have access to the shared libraries on the new machine. The
alternative is to build the executable without shared libraries by first deleting the
shared libraries and then creating the executable.

```{bibliography} /petsc.bib
:filter: docname in docnames
```