19b92b1d3SBarry Smith(doc_faq)= 29b92b1d3SBarry Smith 39b92b1d3SBarry Smith# FAQ 49b92b1d3SBarry Smith 59b92b1d3SBarry Smith```{contents} Table Of Contents 69b92b1d3SBarry Smith:backlinks: top 79b92b1d3SBarry Smith:local: true 89b92b1d3SBarry Smith``` 99b92b1d3SBarry Smith 109b92b1d3SBarry Smith______________________________________________________________________ 119b92b1d3SBarry Smith 129b92b1d3SBarry Smith## General 139b92b1d3SBarry Smith 149b92b1d3SBarry Smith### How can I subscribe to the PETSc mailing lists? 159b92b1d3SBarry Smith 169b92b1d3SBarry SmithSee mailing list {ref}`documentation <doc_mail>` 179b92b1d3SBarry Smith 189b92b1d3SBarry Smith### Any useful books on numerical computing? 199b92b1d3SBarry Smith 209b92b1d3SBarry Smith[Bueler, PETSc for Partial Differential Equations: Numerical Solutions in C and Python](https://my.siam.org/Store/Product/viewproduct/?ProductId=32850137) 219b92b1d3SBarry Smith 229b92b1d3SBarry Smith[Oliveira and Stewart, Writing Scientific Software: A Guide to Good Style](https://www.cambridge.org/core/books/writing-scientific-software/23206704175AF868E43FE3FB399C2F53) 239b92b1d3SBarry Smith 249b92b1d3SBarry Smith(doc_faq_general_parallel)= 259b92b1d3SBarry Smith 269b92b1d3SBarry Smith### What kind of parallel computers or clusters are needed to use PETSc? Or why do I get little speedup? 279b92b1d3SBarry Smith 289b92b1d3SBarry Smith:::{important} 299b92b1d3SBarry SmithPETSc can be used with any kind of parallel system that supports MPI BUT for any decent 309b92b1d3SBarry Smithperformance one needs: 319b92b1d3SBarry Smith 329b92b1d3SBarry Smith- Fast, **low-latency** interconnect; any ethernet (even 10 GigE) simply cannot provide 339b92b1d3SBarry Smith the needed performance. 349b92b1d3SBarry Smith- High per-core **memory** performance. Each core needs to 359b92b1d3SBarry Smith have its **own** memory bandwidth of at least 2 or more gigabytes/second. Most modern 369b92b1d3SBarry Smith computers are not bottlenecked by how fast they can perform 379b92b1d3SBarry Smith calculations; rather, they are usually restricted by how quickly they can get their 389b92b1d3SBarry Smith data. 399b92b1d3SBarry Smith::: 409b92b1d3SBarry Smith 419b92b1d3SBarry SmithTo obtain good performance it is important that you know your machine! I.e. how many 429b92b1d3SBarry Smithcompute nodes (generally, how many motherboards), how many memory sockets per node and how 439b92b1d3SBarry Smithmany cores per memory socket and how much memory bandwidth for each. 449b92b1d3SBarry Smith 459b92b1d3SBarry SmithIf you do not know this and can run MPI programs with mpiexec (that is, you don't have 469b92b1d3SBarry Smithbatch system), run the following from `$PETSC_DIR`: 479b92b1d3SBarry Smith 489b92b1d3SBarry Smith```console 499b92b1d3SBarry Smith$ make streams [NPMAX=maximum_number_of_mpi_processes_you_plan_to_use] 509b92b1d3SBarry Smith``` 519b92b1d3SBarry Smith 529b92b1d3SBarry SmithThis will provide a summary of the bandwidth with different number of MPI 539b92b1d3SBarry Smithprocesses and potential speedups. See {any}`ch_streams` for a detailed discussion. 549b92b1d3SBarry Smith 559b92b1d3SBarry SmithIf you have a batch system: 569b92b1d3SBarry Smith 579b92b1d3SBarry Smith```console 589b92b1d3SBarry Smith$ cd $PETSC_DIR/src/benchmarks/streams 599b92b1d3SBarry Smith$ make MPIVersion 609b92b1d3SBarry Smithsubmit MPIVersion to the batch system a number of times with 1, 2, 3, etc. MPI processes 619b92b1d3SBarry Smithcollecting all of the output from the runs into the single file scaling.log. Copy 629b92b1d3SBarry Smithscaling.log into the src/benchmarks/streams directory. 639b92b1d3SBarry Smith$ ./process.py createfile ; process.py 649b92b1d3SBarry Smith``` 659b92b1d3SBarry Smith 669b92b1d3SBarry SmithEven if you have enough memory bandwidth if the OS switches processes between cores 679b92b1d3SBarry Smithperformance can degrade. Smart process to core/socket binding (this just means locking a 689b92b1d3SBarry Smithprocess to a particular core or memory socket) may help you. For example, consider using 699b92b1d3SBarry Smithfewer processes than cores and binding processes to separate sockets so that each process 709b92b1d3SBarry Smithuses a different memory bus: 719b92b1d3SBarry Smith 729b92b1d3SBarry Smith- [MPICH2 binding with the Hydra process manager](https://github.com/pmodels/mpich/blob/main/doc/wiki/how_to/Using_the_Hydra_Process_Manager.md#process-core-binding] 739b92b1d3SBarry Smith 749b92b1d3SBarry Smith ```console 759b92b1d3SBarry Smith $ mpiexec.hydra -n 4 --binding cpu:sockets 769b92b1d3SBarry Smith ``` 779b92b1d3SBarry Smith 789b92b1d3SBarry Smith- [Open MPI binding](https://www.open-mpi.org/faq/?category=tuning#using-paffinity) 799b92b1d3SBarry Smith 809b92b1d3SBarry Smith ```console 819b92b1d3SBarry Smith $ mpiexec -n 4 --map-by socket --bind-to socket --report-bindings 829b92b1d3SBarry Smith ``` 839b92b1d3SBarry Smith 849b92b1d3SBarry Smith- `taskset`, part of the [util-linux](https://github.com/karelzak/util-linux) package 859b92b1d3SBarry Smith 869b92b1d3SBarry Smith Check `man taskset` for details. Make sure to set affinity for **your** program, 879b92b1d3SBarry Smith **not** for the `mpiexec` program. 889b92b1d3SBarry Smith 899b92b1d3SBarry Smith- `numactl` 909b92b1d3SBarry Smith 919b92b1d3SBarry Smith In addition to task affinity, this tool also allows changing the default memory affinity 929b92b1d3SBarry Smith policy. On Linux, the default policy is to attempt to find memory on the same memory bus 939b92b1d3SBarry Smith that serves the core that a thread is running on when the memory is faulted 949b92b1d3SBarry Smith (not when `malloc()` is called). If local memory is not available, it is found 959b92b1d3SBarry Smith elsewhere, possibly leading to serious memory imbalances. 969b92b1d3SBarry Smith 979b92b1d3SBarry Smith The option `--localalloc` allocates memory on the local NUMA node, similar to the 989b92b1d3SBarry Smith `numa_alloc_local()` function in the `libnuma` library. The option 999b92b1d3SBarry Smith `--cpunodebind=nodes` binds the process to a given NUMA node (note that this can be 1009b92b1d3SBarry Smith larger or smaller than a CPU (socket); a NUMA node usually has multiple cores). 1019b92b1d3SBarry Smith 1029b92b1d3SBarry Smith The option `--physcpubind=cpus` binds the process to a given processor core (numbered 1039b92b1d3SBarry Smith according to `/proc/cpuinfo`, therefore including logical cores if Hyper-threading is 1049b92b1d3SBarry Smith enabled). 1059b92b1d3SBarry Smith 1069b92b1d3SBarry Smith With Open MPI, you can use knowledge of the NUMA hierarchy and core numbering on your 1079b92b1d3SBarry Smith machine to calculate the correct NUMA node or processor number given the environment 1089b92b1d3SBarry Smith variable `OMPI_COMM_WORLD_LOCAL_RANK`. In most cases, it is easier to make mpiexec or 1099b92b1d3SBarry Smith a resource manager set affinities. 1109b92b1d3SBarry Smith 1119b92b1d3SBarry Smith### What kind of license is PETSc released under? 1129b92b1d3SBarry Smith 1139b92b1d3SBarry SmithSee licensing {ref}`documentation <doc_license>` 1149b92b1d3SBarry Smith 1159b92b1d3SBarry Smith### Why is PETSc written in C, instead of Fortran or C++? 1169b92b1d3SBarry Smith 1179b92b1d3SBarry SmithWhen this decision was made, in the early 1990s, C enabled us to build data structures 1189b92b1d3SBarry Smithfor storing sparse matrices, solver information, 1199b92b1d3SBarry Smithetc. in ways that Fortran simply did not allow. ANSI C was a complete standard that all 1209b92b1d3SBarry Smithmodern C compilers supported. The language was identical on all machines. C++ was still 1219b92b1d3SBarry Smithevolving and compilers on different machines were not identical. Using C function pointers 1229b92b1d3SBarry Smithto provide data encapsulation and polymorphism allowed us to get many of the advantages of 1239b92b1d3SBarry SmithC++ without using such a large and more complicated language. It would have been natural and 1249b92b1d3SBarry Smithreasonable to have coded PETSc in C++; we opted to use C instead. 1259b92b1d3SBarry Smith 1269b92b1d3SBarry Smith### Does all the PETSc error checking and logging reduce PETSc's efficiency? 1279b92b1d3SBarry Smith 1289b92b1d3SBarry SmithNo 1299b92b1d3SBarry Smith 1309b92b1d3SBarry Smith(doc_faq_maintenance_strats)= 1319b92b1d3SBarry Smith 1329b92b1d3SBarry Smith### How do such a small group of people manage to write and maintain such a large and marvelous package as PETSc? 1339b92b1d3SBarry Smith 1349b92b1d3SBarry Smith1. **We work very efficiently**. 1359b92b1d3SBarry Smith 1369b92b1d3SBarry Smith - We use powerful editors and programming environments. 1379b92b1d3SBarry Smith - Our manual pages are generated automatically from formatted comments in the code, 1389b92b1d3SBarry Smith thus alleviating the need for creating and maintaining manual pages. 1399b92b1d3SBarry Smith - We employ continuous integration testing of the entire PETSc library on many different 1409b92b1d3SBarry Smith machine architectures. This process **significantly** protects (no bug-catching 1419b92b1d3SBarry Smith process is perfect) against inadvertently introducing bugs with new additions. Every 1429b92b1d3SBarry Smith new feature **must** pass our suite of thousands of tests as well as formal code 1439b92b1d3SBarry Smith review before it may be included. 1449b92b1d3SBarry Smith 1459b92b1d3SBarry Smith2. **We are very careful in our design (and are constantly revising our design)** 1469b92b1d3SBarry Smith 1479b92b1d3SBarry Smith - PETSc as a package should be easy to use, write, and maintain. Our mantra is to write 1489b92b1d3SBarry Smith code like everyone is using it. 1499b92b1d3SBarry Smith 1509b92b1d3SBarry Smith3. **We are willing to do the grunt work** 1519b92b1d3SBarry Smith 1529b92b1d3SBarry Smith - PETSc is regularly checked to make sure that all code conforms to our interface 1539b92b1d3SBarry Smith design. We will never keep in a bad design decision simply because changing it will 1549b92b1d3SBarry Smith require a lot of editing; we do a lot of editing. 1559b92b1d3SBarry Smith 1569b92b1d3SBarry Smith4. **We constantly seek out and experiment with new design ideas** 1579b92b1d3SBarry Smith 1589b92b1d3SBarry Smith - We retain the useful ones and discard the rest. All of these decisions are based not 1599b92b1d3SBarry Smith just on performance, but also on **practicality**. 1609b92b1d3SBarry Smith 1619b92b1d3SBarry Smith5. **Function and variable names must adhere to strict guidelines** 1629b92b1d3SBarry Smith 1639b92b1d3SBarry Smith - Even the rules about capitalization are designed to make it easy to figure out the 1649b92b1d3SBarry Smith name of a particular object or routine. Our memories are terrible, so careful 1659b92b1d3SBarry Smith consistent naming puts less stress on our limited human RAM. 1669b92b1d3SBarry Smith 1679b92b1d3SBarry Smith6. **The PETSc directory tree is designed to make it easy to move throughout the 1689b92b1d3SBarry Smith entire package** 1699b92b1d3SBarry Smith 1709b92b1d3SBarry Smith7. **We have a rich, robust, and fast bug reporting system** 1719b92b1d3SBarry Smith 1729b92b1d3SBarry Smith - <mailto:petsc-maint@mcs.anl.gov> is always checked, and we pride ourselves on responding 1739b92b1d3SBarry Smith quickly and accurately. Email is very lightweight, and so bug reports system retains 1749b92b1d3SBarry Smith an archive of all reported problems and fixes, so it is easy to re-find fixes to 1759b92b1d3SBarry Smith previously discovered problems. 1769b92b1d3SBarry Smith 1779b92b1d3SBarry Smith8. **We contain the complexity of PETSc by using powerful object-oriented programming 1789b92b1d3SBarry Smith techniques** 1799b92b1d3SBarry Smith 1809b92b1d3SBarry Smith - Data encapsulation serves to abstract complex data formats or movement to 1819b92b1d3SBarry Smith human-readable format. This is why your program cannot, for example, look directly 1829b92b1d3SBarry Smith at what is inside the object `Mat`. 1839b92b1d3SBarry Smith - Polymorphism makes changing program behavior as easy as possible, and further 1849b92b1d3SBarry Smith abstracts the *intent* of your program from what is *written* in code. You call 1859b92b1d3SBarry Smith `MatMult()` regardless of whether your matrix is dense, sparse, parallel or 1869b92b1d3SBarry Smith sequential; you don't call a different routine for each format. 1879b92b1d3SBarry Smith 1889b92b1d3SBarry Smith9. **We try to provide the functionality requested by our users** 1899b92b1d3SBarry Smith 1909b92b1d3SBarry Smith### For complex numbers will I get better performance with C++? 1919b92b1d3SBarry Smith 1929b92b1d3SBarry SmithTo use PETSc with complex numbers you may use the following `configure` options; 1939b92b1d3SBarry Smith`--with-scalar-type=complex` and either `--with-clanguage=c++` or (the default) 1949b92b1d3SBarry Smith`--with-clanguage=c`. In our experience they will deliver very similar performance 1959b92b1d3SBarry Smith(speed), but if one is concerned they should just try both and see if one is faster. 1969b92b1d3SBarry Smith 1979b92b1d3SBarry Smith### How come when I run the same program on the same number of processes I get a "different" answer? 1989b92b1d3SBarry Smith 1999b92b1d3SBarry SmithInner products and norms in PETSc are computed using the `MPI_Allreduce()` command. For 2009b92b1d3SBarry Smithdifferent runs the order at which values arrive at a given process (via MPI) can be in a 2019b92b1d3SBarry Smithdifferent order, thus the order in which some floating point arithmetic operations are 2029b92b1d3SBarry Smithperformed will be different. Since floating point arithmetic is not 2039b92b1d3SBarry Smithassociative, the computed quantity may be slightly different. 2049b92b1d3SBarry Smith 2059b92b1d3SBarry SmithOver a run the many slight differences in the inner products and norms will effect all the 2069b92b1d3SBarry Smithcomputed results. It is important to realize that none of the computed answers are any 2079b92b1d3SBarry Smithless right or wrong (in fact the sequential computation is no more right then the parallel 2089b92b1d3SBarry Smithones). All answers are equal, but some are more equal than others. 2099b92b1d3SBarry Smith 2109b92b1d3SBarry SmithThe discussion above assumes that the exact same algorithm is being used on the different 2119b92b1d3SBarry Smithnumber of processes. When the algorithm is different for the different number of processes 2129b92b1d3SBarry Smith(almost all preconditioner algorithms except Jacobi are different for different number of 2139b92b1d3SBarry Smithprocesses) then one expects to see (and does) a greater difference in results for 2149b92b1d3SBarry Smithdifferent numbers of processes. In some cases (for example block Jacobi preconditioner) it 2159b92b1d3SBarry Smithmay be that the algorithm works for some number of processes and does not work for others. 2169b92b1d3SBarry Smith 2179b92b1d3SBarry Smith### How come when I run the same linear solver on a different number of processes it takes a different number of iterations? 2189b92b1d3SBarry Smith 2199b92b1d3SBarry SmithThe convergence of many of the preconditioners in PETSc including the default parallel 2209b92b1d3SBarry Smithpreconditioner block Jacobi depends on the number of processes. The more processes the 2219b92b1d3SBarry Smith(slightly) slower convergence it has. This is the nature of iterative solvers, the more 2229b92b1d3SBarry Smithparallelism means the more "older" information is used in the solution process hence 2239b92b1d3SBarry Smithslower convergence. 2249b92b1d3SBarry Smith 2259b92b1d3SBarry Smith(doc_faq_gpuhowto)= 2269b92b1d3SBarry Smith 2279b92b1d3SBarry Smith### Can PETSc use GPUs to speed up computations? 2289b92b1d3SBarry Smith 2299b92b1d3SBarry Smith:::{seealso} 2309b92b1d3SBarry SmithSee GPU development {ref}`roadmap <doc_gpu_roadmap>` for the latest information 2319b92b1d3SBarry Smithregarding the state of PETSc GPU integration. 2329b92b1d3SBarry Smith 2339b92b1d3SBarry SmithSee GPU install {ref}`documentation <doc_config_accel>` for up-to-date information on 2349b92b1d3SBarry Smithinstalling PETSc to use GPU's. 2359b92b1d3SBarry Smith::: 2369b92b1d3SBarry Smith 2379b92b1d3SBarry SmithQuick summary of usage with CUDA: 2389b92b1d3SBarry Smith 2399b92b1d3SBarry Smith- The `VecType` `VECSEQCUDA`, `VECMPICUDA`, or `VECCUDA` may be used with 2409b92b1d3SBarry Smith `VecSetType()` or `-vec_type seqcuda`, `mpicuda`, or `cuda` when 2419b92b1d3SBarry Smith `VecSetFromOptions()` is used. 2429b92b1d3SBarry Smith- The `MatType` `MATSEQAIJCUSPARSE`, `MATMPIAIJCUSPARSE`, or `MATAIJCUSPARSE` 2439b92b1d3SBarry Smith may be used with `MatSetType()` or `-mat_type seqaijcusparse`, `mpiaijcusparse`, or 2449b92b1d3SBarry Smith `aijcusparse` when `MatSetFromOptions()` is used. 2459b92b1d3SBarry Smith- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type 2469b92b1d3SBarry Smith cuda` and `-dm_mat_type aijcusparse`. 2479b92b1d3SBarry Smith 2489b92b1d3SBarry SmithQuick summary of usage with OpenCL (provided by the ViennaCL library): 2499b92b1d3SBarry Smith 2509b92b1d3SBarry Smith- The `VecType` `VECSEQVIENNACL`, `VECMPIVIENNACL`, or `VECVIENNACL` may be used 2519b92b1d3SBarry Smith with `VecSetType()` or `-vec_type seqviennacl`, `mpiviennacl`, or `viennacl` 2529b92b1d3SBarry Smith when `VecSetFromOptions()` is used. 2539b92b1d3SBarry Smith- The `MatType` `MATSEQAIJVIENNACL`, `MATMPIAIJVIENNACL`, or `MATAIJVIENNACL` 2549b92b1d3SBarry Smith may be used with `MatSetType()` or `-mat_type seqaijviennacl`, `mpiaijviennacl`, or 2559b92b1d3SBarry Smith `aijviennacl` when `MatSetFromOptions()` is used. 2569b92b1d3SBarry Smith- If you are creating the vectors and matrices with a `DM`, you can use `-dm_vec_type 2579b92b1d3SBarry Smith viennacl` and `-dm_mat_type aijviennacl`. 2589b92b1d3SBarry Smith 2599b92b1d3SBarry SmithGeneral hints: 2609b92b1d3SBarry Smith 2619b92b1d3SBarry Smith- It is useful to develop your code with the default vectors and then run production runs 2629b92b1d3SBarry Smith with the command line options to use the GPU since debugging on GPUs is difficult. 2639b92b1d3SBarry Smith- All of the Krylov methods except `KSPIBCGS` run on the GPU. 2649b92b1d3SBarry Smith- Parts of most preconditioners run directly on the GPU. After setup, `PCGAMG` runs 2659b92b1d3SBarry Smith fully on GPUs, without any memory copies between the CPU and GPU. 2669b92b1d3SBarry Smith 2679b92b1d3SBarry SmithSome GPU systems (for example many laptops) only run with single precision; thus, PETSc 2689b92b1d3SBarry Smithmust be built with the `configure` option `--with-precision=single`. 2699b92b1d3SBarry Smith 2709b92b1d3SBarry Smith(doc_faq_extendedprecision)= 2719b92b1d3SBarry Smith 2729b92b1d3SBarry Smith### Can I run PETSc with extended precision? 2739b92b1d3SBarry Smith 2749b92b1d3SBarry SmithYes, with gcc and gfortran. `configure` PETSc using the 2759b92b1d3SBarry Smithoptions `--with-precision=__float128` and `--download-f2cblaslapack`. 2769b92b1d3SBarry Smith 2779b92b1d3SBarry Smith:::{admonition} Warning 2789b92b1d3SBarry Smith:class: yellow 2799b92b1d3SBarry Smith 2809b92b1d3SBarry SmithExternal packages are not guaranteed to work in this mode! 2819b92b1d3SBarry Smith::: 2829b92b1d3SBarry Smith 2839b92b1d3SBarry Smith### Why doesn't PETSc use Qd to implement support for extended precision? 2849b92b1d3SBarry Smith 2859b92b1d3SBarry SmithWe tried really hard but could not. The problem is that the QD c++ classes, though they 2869b92b1d3SBarry Smithtry to, implement the built-in data types of `double` are not native types and cannot 2879b92b1d3SBarry Smith"just be used" in a general piece of numerical source code. Ratherm the code has to 2889b92b1d3SBarry Smithrewritten to live within the limitations of QD classes. However PETSc can be built to use 2899b92b1d3SBarry Smithquad precision, as detailed {ref}`here <doc_faq_extendedprecision>`. 2909b92b1d3SBarry Smith 2919b92b1d3SBarry Smith### How do I cite PETSc? 2929b92b1d3SBarry Smith 2939b92b1d3SBarry SmithUse {any}`these citations <doc_index_citing_petsc>`. 2949b92b1d3SBarry Smith 2959b92b1d3SBarry Smith______________________________________________________________________ 2969b92b1d3SBarry Smith 2979b92b1d3SBarry Smith## Installation 2989b92b1d3SBarry Smith 2999b92b1d3SBarry Smith### How do I begin using PETSc if the software has already been completely built and installed by someone else? 3009b92b1d3SBarry Smith 3019b92b1d3SBarry SmithAssuming that the PETSc libraries have been successfully built for a particular 3029b92b1d3SBarry Smitharchitecture and level of optimization, a new user must merely: 3039b92b1d3SBarry Smith 3049b92b1d3SBarry Smith1. Set `PETSC_DIR` to the full path of the PETSc home 3059b92b1d3SBarry Smith directory. This will be the location of the `configure` script, and usually called 3069b92b1d3SBarry Smith "petsc" or some variation of that (for example, /home/username/petsc). 3079b92b1d3SBarry Smith2. Set `PETSC_ARCH`, which indicates the configuration on which PETSc will be 3089b92b1d3SBarry Smith used. Note that `$PETSC_ARCH` is simply a name the installer used when installing 3099b92b1d3SBarry Smith the libraries. There will exist a directory within `$PETSC_DIR` that is named after 3109b92b1d3SBarry Smith its corresponding `$PETSC_ARCH`. There many be several on a single system, for 3119b92b1d3SBarry Smith example "linux-c-debug" for the debug versions compiled by a C compiler or 3129b92b1d3SBarry Smith "linux-c-opt" for the optimized version. 3139b92b1d3SBarry Smith 3149b92b1d3SBarry Smith:::{admonition} Still Stuck? 3159b92b1d3SBarry SmithSee the {ref}`quick-start tutorial <tut_install>` for a step-by-step guide on 3169b92b1d3SBarry Smithinstalling PETSc, in case you have missed a step. 3179b92b1d3SBarry Smith 3187f296bb3SBarry SmithSee the users manual section on {ref}`getting started <sec_getting_started>`. 3199b92b1d3SBarry Smith::: 3209b92b1d3SBarry Smith 3219b92b1d3SBarry Smith### The PETSc distribution is SO Large. How can I reduce my disk space usage? 3229b92b1d3SBarry Smith 3239b92b1d3SBarry Smith1. The PETSc users manual is provided in PDF format at `$PETSC_DIR/manual.pdf`. You 3249b92b1d3SBarry Smith can delete this. 3259b92b1d3SBarry Smith2. The PETSc test suite contains sample output for many of the examples. These are 3269b92b1d3SBarry Smith contained in the PETSc directories `$PETSC_DIR/src/*/tutorials/output` and 3279b92b1d3SBarry Smith `$PETSC_DIR/src/*/tests/output`. Once you have run the test examples, you may remove 3289b92b1d3SBarry Smith all of these directories to save some disk space. You can locate the largest with 3299b92b1d3SBarry Smith e.g. `find . -name output -type d | xargs du -sh | sort -hr` on a Unix-based system. 3309b92b1d3SBarry Smith3. The debugging versions of the libraries are larger than the optimized versions. In a 3319b92b1d3SBarry Smith pinch you can work with the optimized version, although we bid you good luck in 3329b92b1d3SBarry Smith finding bugs as it is much easier with the debug version. 3339b92b1d3SBarry Smith 3349b92b1d3SBarry Smith### I want to use PETSc only for uniprocessor programs. Must I still install and use a version of MPI? 3359b92b1d3SBarry Smith 3369b92b1d3SBarry SmithNo, run `configure` with the option `--with-mpi=0` 3379b92b1d3SBarry Smith 3389b92b1d3SBarry Smith### Can I install PETSc to not use X windows (either under Unix or Microsoft Windows with GCC)? 3399b92b1d3SBarry Smith 3409b92b1d3SBarry SmithYes. Run `configure` with the additional flag `--with-x=0` 3419b92b1d3SBarry Smith 3429b92b1d3SBarry Smith### Why do you use MPI? 3439b92b1d3SBarry Smith 3449b92b1d3SBarry SmithMPI is the message-passing standard. Because it is a standard, it will not frequently change over 3459b92b1d3SBarry Smithtime; thus, we do not have to change PETSc every time the provider of the message-passing 3469b92b1d3SBarry Smithsystem decides to make an interface change. MPI was carefully designed by experts from 3479b92b1d3SBarry Smithindustry, academia, and government labs to provide the highest quality performance and 3489b92b1d3SBarry Smithcapability. 3499b92b1d3SBarry Smith 3509b92b1d3SBarry SmithFor example, the careful design of communicators in MPI allows the easy nesting of 3519b92b1d3SBarry Smithdifferent libraries; no other message-passing system provides this support. All of the 3529b92b1d3SBarry Smithmajor parallel computer vendors were involved in the design of MPI and have committed to 3539b92b1d3SBarry Smithproviding quality implementations. 3549b92b1d3SBarry Smith 3559b92b1d3SBarry SmithIn addition, since MPI is a standard, several different groups have already provided 3569b92b1d3SBarry Smithcomplete free implementations. Thus, one does not have to rely on the technical skills of 3579b92b1d3SBarry Smithone particular group to provide the message-passing libraries. Today, MPI is the only 3589b92b1d3SBarry Smithpractical, portable approach to writing efficient parallel numerical software. 3599b92b1d3SBarry Smith 3609b92b1d3SBarry Smith(invalid_mpi_compilers)= 3619b92b1d3SBarry Smith 3629b92b1d3SBarry Smith### What do I do if my MPI compiler wrappers are invalid? 3639b92b1d3SBarry Smith 3649b92b1d3SBarry SmithMost MPI implementations provide compiler wrappers (such as `mpicc`) which give the 3659b92b1d3SBarry Smithinclude and link options necessary to use that version of MPI to the underlying compilers 3669b92b1d3SBarry Smith. Configuration will fail if these wrappers are either absent or broken in the MPI pointed to by 3679b92b1d3SBarry Smith`--with-mpi-dir`. You can rerun the configure with the additional option 3689b92b1d3SBarry Smith`--with-mpi-compilers=0`, which will try to auto-detect working compilers; however, 3699b92b1d3SBarry Smiththese compilers may be incompatible with the particular MPI build. If this fix does not 3709b92b1d3SBarry Smithwork, run with `--with-cc=[your_c_compiler]` where you know `your_c_compiler` works 3719b92b1d3SBarry Smithwith this particular MPI, and likewise for C++ (`--with-cxx=[your_cxx_compiler]`) and Fortran 3729b92b1d3SBarry Smith(`--with-fc=[your_fortran_compiler]`). 3739b92b1d3SBarry Smith 3749b92b1d3SBarry Smith(bit_indices)= 3759b92b1d3SBarry Smith 3769b92b1d3SBarry Smith### When should/can I use the `configure` option `--with-64-bit-indices`? 3779b92b1d3SBarry Smith 3789b92b1d3SBarry SmithBy default the type that PETSc uses to index into arrays and keep sizes of arrays is a 3799b92b1d3SBarry Smith`PetscInt` defined to be a 32-bit `int`. If your problem: 3809b92b1d3SBarry Smith 3819b92b1d3SBarry Smith- Involves more than 2^31 - 1 unknowns (around 2 billion). 3829b92b1d3SBarry Smith- Your matrix might contain more than 2^31 - 1 nonzeros on a single process. 3839b92b1d3SBarry Smith 3849b92b1d3SBarry SmithThen you need to use this option. Otherwise you will get strange crashes. 3859b92b1d3SBarry Smith 3869b92b1d3SBarry SmithThis option can be used when you are using either 32 or 64-bit pointers. You do not 3879b92b1d3SBarry Smithneed to use this option if you are using 64-bit pointers unless the two conditions above 3889b92b1d3SBarry Smithhold. 3899b92b1d3SBarry Smith 3909b92b1d3SBarry Smith### What if I get an internal compiler error? 3919b92b1d3SBarry Smith 3929b92b1d3SBarry SmithYou can rebuild the offending file individually with a lower optimization level. **Then 3939b92b1d3SBarry Smithmake sure to complain to the compiler vendor and file a bug report**. For example, if the 3949b92b1d3SBarry Smithcompiler chokes on `src/mat/impls/baij/seq/baijsolvtrannat.c` you can run the following 3959b92b1d3SBarry Smithfrom `$PETSC_DIR`: 3969b92b1d3SBarry Smith 3979b92b1d3SBarry Smith```console 3989b92b1d3SBarry Smith$ make -f gmakefile PCC_FLAGS="-O1" $PETSC_ARCH/obj/src/mat/impls/baij/seq/baijsolvtrannat.o 3999b92b1d3SBarry Smith$ make all 4009b92b1d3SBarry Smith``` 4019b92b1d3SBarry Smith 4029b92b1d3SBarry Smith### How do I enable Python bindings (petsc4py) with PETSc? 4039b92b1d3SBarry Smith 4049b92b1d3SBarry Smith1. Install [Cython](https://cython.org/). 4059b92b1d3SBarry Smith2. `configure` PETSc with the `--with-petsc4py=1` option. 4069b92b1d3SBarry Smith3. set `PYTHONPATH=$PETSC_DIR/$PETSC_ARCH/lib` 4079b92b1d3SBarry Smith 4089b92b1d3SBarry Smith(macos_gfortran)= 4099b92b1d3SBarry Smith 4109b92b1d3SBarry Smith### What Fortran compiler do you recommend on macOS? 4119b92b1d3SBarry Smith 4129b92b1d3SBarry SmithWe recommend using [homebrew](https://brew.sh/) to install [gfortran](https://gcc.gnu.org/wiki/GFortran), see {any}`doc_macos_install` 4139b92b1d3SBarry Smith 4149b92b1d3SBarry Smith### How can I find the URL locations of the packages you install using `--download-PACKAGE`? 4159b92b1d3SBarry Smith 4169b92b1d3SBarry Smith```console 4179b92b1d3SBarry Smith$ grep "self.download " $PETSC_DIR/config/BuildSystem/config/packages/*.py 4189b92b1d3SBarry Smith``` 4199b92b1d3SBarry Smith 4209b92b1d3SBarry Smith### How to fix the problem: PETSc was configured with one MPICH (or Open MPI) `mpi.h` version but now appears to be compiling using a different MPICH (or Open MPI) `mpi.h` version 4219b92b1d3SBarry Smith 4229b92b1d3SBarry SmithThis happens for generally one of two reasons: 4239b92b1d3SBarry Smith 4249b92b1d3SBarry Smith- You previously ran `configure` with the option `--download-mpich` (or `--download-openmpi`) 4259b92b1d3SBarry Smith but later ran `configure` to use a version of MPI already installed on the 4269b92b1d3SBarry Smith machine. Solution: 4279b92b1d3SBarry Smith 4289b92b1d3SBarry Smith ```console 4299b92b1d3SBarry Smith $ rm -rf $PETSC_DIR/$PETSC_ARCH 4309b92b1d3SBarry Smith $ ./configure --your-args 4319b92b1d3SBarry Smith ``` 4329b92b1d3SBarry Smith 4339b92b1d3SBarry Smith(mpi_network_misconfigure)= 4349b92b1d3SBarry Smith 4359b92b1d3SBarry Smith### What does it mean when `make check` hangs or errors on `PetscOptionsInsertFile()`? 4369b92b1d3SBarry Smith 4379b92b1d3SBarry SmithFor example: 4389b92b1d3SBarry Smith 4399b92b1d3SBarry Smith```none 4409b92b1d3SBarry SmithPossible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes 4419b92b1d3SBarry SmithSee https://petsc.org/release/faq/ 4429b92b1d3SBarry Smith[0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c 4439b92b1d3SBarry Smith[0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /Users/barrysmith/Src/PETSc/src/sys/objects/options.c 4449b92b1d3SBarry Smith[0]PETSC ERROR: #3 PetscInitialize() line 828 in /Users/barrysmith/Src/PETSc/src/sys/objects/pinit.c 4459b92b1d3SBarry Smith``` 4469b92b1d3SBarry Smith 4479b92b1d3SBarry Smithor 4489b92b1d3SBarry Smith 4499b92b1d3SBarry Smith```none 4509b92b1d3SBarry Smith$ make check 4519b92b1d3SBarry SmithRunning check examples to verify correct installation 4529b92b1d3SBarry SmithUsing PETSC_DIR=/Users/barrysmith/Src/petsc and PETSC_ARCH=arch-fix-mpiexec-hang-2-ranks 4539b92b1d3SBarry SmithC/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process 4549b92b1d3SBarry SmithPROGRAM SEEMS TO BE HANGING HERE 4559b92b1d3SBarry Smith``` 4569b92b1d3SBarry Smith 4579b92b1d3SBarry SmithThis usually occurs when network settings are misconfigured (perhaps due to VPN) resulting in a failure or hang in system call `gethostbyname()`. 4589b92b1d3SBarry Smith 4599b92b1d3SBarry Smith- Verify you are using the correct `mpiexec` for the MPI you have linked PETSc with. 4609b92b1d3SBarry Smith 4619b92b1d3SBarry Smith- If you have a VPN enabled on your machine, try turning it off and then running `make check` to 4629b92b1d3SBarry Smith verify that it is not the VPN playing poorly with MPI. 4639b92b1d3SBarry Smith 4647f296bb3SBarry Smith- If ``ping `hostname` `` (`/sbin/ping` on macOS) fails or hangs do: 4659b92b1d3SBarry Smith 4669b92b1d3SBarry Smith ```none 4679b92b1d3SBarry Smith echo 127.0.0.1 `hostname` | sudo tee -a /etc/hosts 4689b92b1d3SBarry Smith ``` 4699b92b1d3SBarry Smith 4709b92b1d3SBarry Smith and try `make check` again. 4719b92b1d3SBarry Smith 4729b92b1d3SBarry Smith- Try completely disconnecting your machine from the network and see if `make check` then works 4739b92b1d3SBarry Smith 4749b92b1d3SBarry Smith- Try the PETSc `configure` option `--download-mpich-device=ch3:nemesis` with `--download-mpich`. 4759b92b1d3SBarry Smith 4769b92b1d3SBarry Smith______________________________________________________________________ 4779b92b1d3SBarry Smith 4789b92b1d3SBarry Smith## Usage 4799b92b1d3SBarry Smith 4809b92b1d3SBarry Smith### How can I redirect PETSc's `stdout` and `stderr` when programming with a GUI interface in Windows Developer Studio or to C++ streams? 4819b92b1d3SBarry Smith 4829b92b1d3SBarry SmithTo overload just the error messages write your own `MyPrintError()` function that does 4839b92b1d3SBarry Smithwhatever you want (including pop up windows etc) and use it like below. 4849b92b1d3SBarry Smith 4859b92b1d3SBarry Smith```c 4869b92b1d3SBarry Smithextern "C" { 4879b92b1d3SBarry Smith int PASCAL WinMain(HINSTANCE,HINSTANCE,LPSTR,int); 4889b92b1d3SBarry Smith}; 4899b92b1d3SBarry Smith 4909b92b1d3SBarry Smith#include <petscsys.h> 4919b92b1d3SBarry Smith#include <mpi.h> 4929b92b1d3SBarry Smith 4939b92b1d3SBarry Smithconst char help[] = "Set up from main"; 4949b92b1d3SBarry Smith 4959b92b1d3SBarry Smithint MyPrintError(const char error[], ...) 4969b92b1d3SBarry Smith{ 4979b92b1d3SBarry Smith printf("%s", error); 4989b92b1d3SBarry Smith return 0; 4999b92b1d3SBarry Smith} 5009b92b1d3SBarry Smith 5019b92b1d3SBarry Smithint main(int ac, char *av[]) 5029b92b1d3SBarry Smith{ 5039b92b1d3SBarry Smith char buf[256]; 5049b92b1d3SBarry Smith HINSTANCE inst; 5059b92b1d3SBarry Smith PetscErrorCode ierr; 5069b92b1d3SBarry Smith 5079b92b1d3SBarry Smith inst = (HINSTANCE)GetModuleHandle(NULL); 5089b92b1d3SBarry Smith PetscErrorPrintf = MyPrintError; 5099b92b1d3SBarry Smith 5109b92b1d3SBarry Smith buf[0] = 0; 5119b92b1d3SBarry Smith for (int i = 1; i < ac; ++i) { 5129b92b1d3SBarry Smith strcat(buf, av[i]); 5139b92b1d3SBarry Smith strcat(buf, " "); 5149b92b1d3SBarry Smith } 5159b92b1d3SBarry Smith 5169b92b1d3SBarry Smith PetscCall(PetscInitialize(&ac, &av, NULL, help)); 5179b92b1d3SBarry Smith 5189b92b1d3SBarry Smith return WinMain(inst, NULL, buf, SW_SHOWNORMAL); 5199b92b1d3SBarry Smith} 5209b92b1d3SBarry Smith``` 5219b92b1d3SBarry Smith 5229b92b1d3SBarry SmithPlace this file in the project and compile with this preprocessor definitions: 5239b92b1d3SBarry Smith 5249b92b1d3SBarry Smith``` 5259b92b1d3SBarry SmithWIN32 5269b92b1d3SBarry Smith_DEBUG 5279b92b1d3SBarry Smith_CONSOLE 5289b92b1d3SBarry Smith_MBCS 5299b92b1d3SBarry SmithUSE_PETSC_LOG 5309b92b1d3SBarry SmithUSE_PETSC_BOPT_g 5319b92b1d3SBarry SmithUSE_PETSC_STACK 5329b92b1d3SBarry Smith_AFXDLL 5339b92b1d3SBarry Smith``` 5349b92b1d3SBarry Smith 5359b92b1d3SBarry SmithAnd these link options: 5369b92b1d3SBarry Smith 5379b92b1d3SBarry Smith``` 5389b92b1d3SBarry Smith/nologo 5399b92b1d3SBarry Smith/subsystem:console 5409b92b1d3SBarry Smith/incremental:yes 5419b92b1d3SBarry Smith/debug 5429b92b1d3SBarry Smith/machine:I386 5439b92b1d3SBarry Smith/nodefaultlib:"libcmtd.lib" 5449b92b1d3SBarry Smith/nodefaultlib:"libcd.lib" 5459b92b1d3SBarry Smith/nodefaultlib:"mvcrt.lib" 5469b92b1d3SBarry Smith/pdbtype:sept 5479b92b1d3SBarry Smith``` 5489b92b1d3SBarry Smith 5499b92b1d3SBarry Smith:::{note} 5509b92b1d3SBarry SmithThe above is compiled and linked as if it was a console program. The linker will search 5519b92b1d3SBarry Smithfor a main, and then from it the `WinMain` will start. This works with MFC templates and 5529b92b1d3SBarry Smithderived classes too. 5539b92b1d3SBarry Smith 5549b92b1d3SBarry SmithWhen writing a Window's console application you do not need to do anything, the `stdout` 5559b92b1d3SBarry Smithand `stderr` is automatically output to the console window. 5569b92b1d3SBarry Smith::: 5579b92b1d3SBarry Smith 5589b92b1d3SBarry SmithTo change where all PETSc `stdout` and `stderr` go, (you can also reassign 5599b92b1d3SBarry Smith`PetscVFPrintf()` to handle `stdout` and `stderr` any way you like) write the 5609b92b1d3SBarry Smithfollowing function: 5619b92b1d3SBarry Smith 5629b92b1d3SBarry Smith``` 5639b92b1d3SBarry SmithPetscErrorCode mypetscvfprintf(FILE *fd, const char format[], va_list Argp) 5649b92b1d3SBarry Smith{ 5659b92b1d3SBarry Smith PetscFunctionBegin; 5669b92b1d3SBarry Smith if (fd != stdout && fd != stderr) { /* handle regular files */ 5679b92b1d3SBarry Smith PetscCall(PetscVFPrintfDefault(fd, format, Argp)); 5689b92b1d3SBarry Smith } else { 5699b92b1d3SBarry Smith char buff[1024]; /* Make sure to assign a large enough buffer */ 5709b92b1d3SBarry Smith int length; 5719b92b1d3SBarry Smith 5729b92b1d3SBarry Smith PetscCall(PetscVSNPrintf(buff, 1024, format, &length, Argp)); 5739b92b1d3SBarry Smith 5749b92b1d3SBarry Smith /* now send buff to whatever stream or whatever you want */ 5759b92b1d3SBarry Smith } 5769b92b1d3SBarry Smith PetscFunctionReturn(PETSC_SUCCESS); 5779b92b1d3SBarry Smith} 5789b92b1d3SBarry Smith``` 5799b92b1d3SBarry Smith 5809b92b1d3SBarry SmithThen assign `PetscVFPrintf = mypetscprintf` before `PetscInitialize()` in your main 5819b92b1d3SBarry Smithprogram. 5829b92b1d3SBarry Smith 5839b92b1d3SBarry Smith### I want to use Hypre boomerAMG without GMRES but when I run `-pc_type hypre -pc_hypre_type boomeramg -ksp_type preonly` I don't get a very accurate answer! 5849b92b1d3SBarry Smith 5859b92b1d3SBarry SmithYou should run with `-ksp_type richardson` to have PETSc run several V or W 5869b92b1d3SBarry Smithcycles. `-ksp_type preonly` causes boomerAMG to use only one V/W cycle. You can control 5879b92b1d3SBarry Smithhow many cycles are used in a single application of the boomerAMG preconditioner with 5889b92b1d3SBarry Smith`-pc_hypre_boomeramg_max_iter <it>` (the default is 1). You can also control the 5899b92b1d3SBarry Smithtolerance boomerAMG uses to decide if to stop before `max_iter` with 5909b92b1d3SBarry Smith`-pc_hypre_boomeramg_tol <tol>` (the default is 1.e-7). Run with `-ksp_view` to see 5919b92b1d3SBarry Smithall the hypre options used and `-help | grep boomeramg` to see all the command line 5929b92b1d3SBarry Smithoptions. 5939b92b1d3SBarry Smith 5949b92b1d3SBarry Smith### How do I use PETSc for Domain Decomposition? 5959b92b1d3SBarry Smith 5969b92b1d3SBarry SmithPETSc includes Additive Schwarz methods in the suite of preconditioners under the umbrella 5979b92b1d3SBarry Smithof `PCASM`. These may be activated with the runtime option `-pc_type asm`. Various 5989b92b1d3SBarry Smithother options may be set, including the degree of overlap `-pc_asm_overlap <number>` the 5999b92b1d3SBarry Smithtype of restriction/extension `-pc_asm_type [basic,restrict,interpolate,none]` sets ASM 6009b92b1d3SBarry Smithtype and several others. You may see the available ASM options by using `-pc_type asm 6019b92b1d3SBarry Smith-help`. See the procedural interfaces in the manual pages, for example `PCASMType()` 6029b92b1d3SBarry Smithand check the index of the users manual for `PCASMCreateSubdomains()`. 6039b92b1d3SBarry Smith 6049b92b1d3SBarry SmithPETSc also contains a domain decomposition inspired wirebasket or face based two level 6059b92b1d3SBarry Smithmethod where the coarse mesh to fine mesh interpolation is defined by solving specific 6069b92b1d3SBarry Smithlocal subdomain problems. It currently only works for 3D scalar problems on structured 6079b92b1d3SBarry Smithgrids created with PETSc `DMDA`. See the manual page for `PCEXOTIC` and 6089b92b1d3SBarry Smith`src/ksp/ksp/tutorials/ex45.c` for an example. 6099b92b1d3SBarry Smith 6109b92b1d3SBarry SmithPETSc also contains a balancing Neumann-Neumann type preconditioner, see the manual page 6119b92b1d3SBarry Smithfor `PCBDDC`. This requires matrices be constructed with `MatCreateIS()` via the finite 6129b92b1d3SBarry Smithelement method. See `src/ksp/ksp/tests/ex59.c` for an example. 6139b92b1d3SBarry Smith 6149b92b1d3SBarry Smith### You have `MATAIJ` and `MATBAIJ` matrix formats, and `MATSBAIJ` for symmetric storage, how come no `MATSAIJ`? 6159b92b1d3SBarry Smith 6169b92b1d3SBarry SmithJust for historical reasons; the `MATSBAIJ` format with blocksize one is just as efficient as 6179b92b1d3SBarry Smitha `MATSAIJ` would be. 6189b92b1d3SBarry Smith 6199b92b1d3SBarry Smith### Can I Create BAIJ matrices with different size blocks for different block rows? 6209b92b1d3SBarry Smith 6219b92b1d3SBarry SmithNo. The `MATBAIJ` format only supports a single fixed block size on the entire 6229b92b1d3SBarry Smithmatrix. But the `MATAIJ` format automatically searches for matching rows and thus still 6239b92b1d3SBarry Smithtakes advantage of the natural blocks in your matrix to obtain good performance. 6249b92b1d3SBarry Smith 6259b92b1d3SBarry Smith:::{note} 6269b92b1d3SBarry SmithIf you use `MATAIJ` you cannot use the `MatSetValuesBlocked()`. 6279b92b1d3SBarry Smith::: 6289b92b1d3SBarry Smith 6299b92b1d3SBarry Smith### How do I access the values of a remote parallel PETSc Vec? 6309b92b1d3SBarry Smith 6319b92b1d3SBarry Smith1. On each process create a local `Vec` large enough to hold all the values it wishes to 6329b92b1d3SBarry Smith access. 6339b92b1d3SBarry Smith2. Create a `VecScatter` that scatters from the parallel `Vec` into the local `Vec`. 6349b92b1d3SBarry Smith3. Use `VecGetArray()` to access the values in the local `Vec`. 6359b92b1d3SBarry Smith 6369b92b1d3SBarry SmithFor example, assuming we have distributed a vector `vecGlobal` of size $N$ to 6379b92b1d3SBarry Smith$R$ ranks and each remote rank holds $N/R = m$ values (similarly assume that 6389b92b1d3SBarry Smith$N$ is cleanly divisible by $R$). We want each rank $r$ to gather the 6399b92b1d3SBarry Smithfirst $n$ (also assume $n \leq m$) values from its immediately superior neighbor 6409b92b1d3SBarry Smith$r+1$ (final rank will retrieve from rank 0). 6419b92b1d3SBarry Smith 6429b92b1d3SBarry Smith``` 6439b92b1d3SBarry SmithVec vecLocal; 6449b92b1d3SBarry SmithIS isLocal, isGlobal; 6459b92b1d3SBarry SmithVecScatter ctx; 6469b92b1d3SBarry SmithPetscScalar *arr; 6479b92b1d3SBarry SmithPetscInt N, firstGlobalIndex; 6489b92b1d3SBarry SmithMPI_Comm comm; 6499b92b1d3SBarry SmithPetscMPIInt r, R; 6509b92b1d3SBarry Smith 6519b92b1d3SBarry Smith/* Create sequential local vector, big enough to hold local portion */ 6529b92b1d3SBarry SmithPetscCall(VecCreateSeq(PETSC_COMM_SELF, n, &vecLocal)); 6539b92b1d3SBarry Smith 6549b92b1d3SBarry Smith/* Create IS to describe where we want to scatter to */ 6559b92b1d3SBarry SmithPetscCall(ISCreateStride(PETSC_COMM_SELF, n, 0, 1, &isLocal)); 6569b92b1d3SBarry Smith 6579b92b1d3SBarry Smith/* Compute the global indices */ 6589b92b1d3SBarry SmithPetscCall(VecGetSize(vecGlobal, &N)); 6599b92b1d3SBarry SmithPetscCall(PetscObjectGetComm((PetscObject) vecGlobal, &comm)); 6609b92b1d3SBarry SmithPetscCallMPI(MPI_Comm_rank(comm, &r)); 6619b92b1d3SBarry SmithPetscCallMPI(MPI_Comm_size(comm, &R)); 6629b92b1d3SBarry SmithfirstGlobalIndex = r == R-1 ? 0 : (N/R)*(r+1); 6639b92b1d3SBarry Smith 6649b92b1d3SBarry Smith/* Create IS that describes where we want to scatter from */ 6659b92b1d3SBarry SmithPetscCall(ISCreateStride(comm, n, firstGlobalIndex, 1, &isGlobal)); 6669b92b1d3SBarry Smith 6679b92b1d3SBarry Smith/* Create the VecScatter context */ 6689b92b1d3SBarry SmithPetscCall(VecScatterCreate(vecGlobal, isGlobal, vecLocal, isLocal, &ctx)); 6699b92b1d3SBarry Smith 6709b92b1d3SBarry Smith/* Gather the values */ 6719b92b1d3SBarry SmithPetscCall(VecScatterBegin(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD)); 6729b92b1d3SBarry SmithPetscCall(VecScatterEnd(ctx, vecGlobal, vecLocal, INSERT_VALUES, SCATTER_FORWARD)); 6739b92b1d3SBarry Smith 6749b92b1d3SBarry Smith/* Retrieve and do work */ 6759b92b1d3SBarry SmithPetscCall(VecGetArray(vecLocal, &arr)); 6769b92b1d3SBarry Smith/* Work */ 6779b92b1d3SBarry SmithPetscCall(VecRestoreArray(vecLocal, &arr)); 6789b92b1d3SBarry Smith 6799b92b1d3SBarry Smith/* Don't forget to clean up */ 6809b92b1d3SBarry SmithPetscCall(ISDestroy(&isLocal)); 6819b92b1d3SBarry SmithPetscCall(ISDestroy(&isGlobal)); 6829b92b1d3SBarry SmithPetscCall(VecScatterDestroy(&ctx)); 6839b92b1d3SBarry SmithPetscCall(VecDestroy(&vecLocal)); 6849b92b1d3SBarry Smith``` 6859b92b1d3SBarry Smith 6869b92b1d3SBarry Smith(doc_faq_usage_alltoone)= 6879b92b1d3SBarry Smith 6889b92b1d3SBarry Smith### How do I collect to a single processor all the values from a parallel PETSc Vec? 6899b92b1d3SBarry Smith 6909b92b1d3SBarry Smith1. Create the `VecScatter` context that will do the communication: 6919b92b1d3SBarry Smith 6929b92b1d3SBarry Smith ``` 6939b92b1d3SBarry Smith Vec in_par,out_seq; 6949b92b1d3SBarry Smith VecScatter ctx; 6959b92b1d3SBarry Smith 6969b92b1d3SBarry Smith PetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq)); 6979b92b1d3SBarry Smith ``` 6989b92b1d3SBarry Smith 6999b92b1d3SBarry Smith2. Initiate the communication (this may be repeated if you wish): 7009b92b1d3SBarry Smith 7019b92b1d3SBarry Smith ``` 7029b92b1d3SBarry Smith PetscCall(VecScatterBegin(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD)); 7039b92b1d3SBarry Smith PetscCall(VecScatterEnd(ctx, in_par, out_seq, INSERT_VALUES, SCATTER_FORWARD)); 7049b92b1d3SBarry Smith /* May destroy context now if no additional scatters are needed, otherwise reuse it */ 7059b92b1d3SBarry Smith PetscCall(VecScatterDestroy(&ctx)); 7069b92b1d3SBarry Smith ``` 7079b92b1d3SBarry Smith 7089b92b1d3SBarry SmithNote that this simply concatenates in the parallel ordering of the vector (computed by the 7099b92b1d3SBarry Smith`MPI_Comm_rank` of the owning process). If you are using a `Vec` from 7109b92b1d3SBarry Smith`DMCreateGlobalVector()` you likely want to first call `DMDAGlobalToNaturalBegin()` 7119b92b1d3SBarry Smithfollowed by `DMDAGlobalToNaturalEnd()` to scatter the original `Vec` into the natural 7129b92b1d3SBarry Smithordering in a new global `Vec` before calling `VecScatterBegin()`/`VecScatterEnd()` 7139b92b1d3SBarry Smithto scatter the natural `Vec` onto all processes. 7149b92b1d3SBarry Smith 7159b92b1d3SBarry Smith### How do I collect all the values from a parallel PETSc Vec on the 0th rank? 7169b92b1d3SBarry Smith 7179b92b1d3SBarry SmithSee FAQ entry on collecting to {ref}`an arbitrary processor <doc_faq_usage_alltoone>`, but 7189b92b1d3SBarry Smithreplace 7199b92b1d3SBarry Smith 7209b92b1d3SBarry Smith``` 7219b92b1d3SBarry SmithPetscCall(VecScatterCreateToAll(in_par, &ctx, &out_seq)); 7229b92b1d3SBarry Smith``` 7239b92b1d3SBarry Smith 7249b92b1d3SBarry Smithwith 7259b92b1d3SBarry Smith 7269b92b1d3SBarry Smith``` 7279b92b1d3SBarry SmithPetscCall(VecScatterCreateToZero(in_par, &ctx, &out_seq)); 7289b92b1d3SBarry Smith``` 7299b92b1d3SBarry Smith 7309b92b1d3SBarry Smith:::{note} 7319b92b1d3SBarry SmithThe same ordering considerations as discussed in the aforementioned entry also apply 7329b92b1d3SBarry Smithhere. 7339b92b1d3SBarry Smith::: 7349b92b1d3SBarry Smith 7359b92b1d3SBarry Smith### How can I read in or write out a sparse matrix in Matrix Market, Harwell-Boeing, Slapc or other ASCII format? 7369b92b1d3SBarry Smith 7379b92b1d3SBarry SmithIf you can read or write your matrix using Python or MATLAB/Octave, `PetscBinaryIO` 7389b92b1d3SBarry Smithmodules are provided at `$PETSC_DIR/lib/petsc/bin` for each language that can assist 7399b92b1d3SBarry Smithwith reading and writing. If you just want to convert `MatrixMarket`, you can use: 7409b92b1d3SBarry Smith 7419b92b1d3SBarry Smith```console 7429b92b1d3SBarry Smith$ python -m $PETSC_DIR/lib/petsc/bin/PetscBinaryIO convert matrix.mtx 7439b92b1d3SBarry Smith``` 7449b92b1d3SBarry Smith 7459b92b1d3SBarry SmithTo produce `matrix.petsc`. 7469b92b1d3SBarry Smith 7479b92b1d3SBarry SmithYou can also call the script directly or import it from your Python code. There is also a 7489b92b1d3SBarry Smith`PETScBinaryIO.jl` Julia package. 7499b92b1d3SBarry Smith 7509b92b1d3SBarry SmithFor other formats, either adapt one of the above libraries or see the examples in 7519b92b1d3SBarry Smith`$PETSC_DIR/src/mat/tests`, specifically `ex72.c` or `ex78.c`. You will likely need 7529b92b1d3SBarry Smithto modify the code slightly to match your required ASCII format. 7539b92b1d3SBarry Smith 7549b92b1d3SBarry Smith:::{note} 7559b92b1d3SBarry SmithNever read or write in parallel an ASCII matrix file. 7569b92b1d3SBarry Smith 7579b92b1d3SBarry SmithInstead read in sequentially with a standalone code based on `ex72.c` or `ex78.c` 7589b92b1d3SBarry Smiththen save the matrix with the binary viewer `PetscViewerBinaryOpen()` and load the 7599b92b1d3SBarry Smithmatrix in parallel in your "real" PETSc program with `MatLoad()`. 7609b92b1d3SBarry Smith 7619b92b1d3SBarry SmithFor writing save with the binary viewer and then load with the sequential code to store 7629b92b1d3SBarry Smithit as ASCII. 7639b92b1d3SBarry Smith::: 7649b92b1d3SBarry Smith 7659b92b1d3SBarry Smith### Does TSSetFromOptions(), SNESSetFromOptions(), or KSPSetFromOptions() reset all the parameters I previously set or how come do they not seem to work? 7669b92b1d3SBarry Smith 7679b92b1d3SBarry SmithIf `XXSetFromOptions()` is used (with `-xxx_type aaaa`) to change the type of the 7689b92b1d3SBarry Smithobject then all parameters associated with the previous type are removed. Otherwise it 7699b92b1d3SBarry Smithdoes not reset parameters. 7709b92b1d3SBarry Smith 7719b92b1d3SBarry Smith`TS/SNES/KSPSetXXX()` commands that set properties for a particular type of object (such 7729b92b1d3SBarry Smithas `KSPGMRESSetRestart()`) ONLY work if the object is ALREADY of that type. For example, 7739b92b1d3SBarry Smithwith 7749b92b1d3SBarry Smith 7759b92b1d3SBarry Smith``` 7769b92b1d3SBarry SmithKSP ksp; 7779b92b1d3SBarry Smith 7789b92b1d3SBarry SmithPetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp)); 7799b92b1d3SBarry SmithPetscCall(KSPGMRESSetRestart(ksp, 10)); 7809b92b1d3SBarry Smith``` 7819b92b1d3SBarry Smith 7829b92b1d3SBarry Smiththe restart will be ignored since the type has not yet been set to `KSPGMRES`. To have 7839b92b1d3SBarry Smiththose values take effect you should do one of the following: 7849b92b1d3SBarry Smith 7859b92b1d3SBarry Smith- Allow setting the type from the command line, if it is not on the command line then the 7869b92b1d3SBarry Smith default type is automatically set. 7879b92b1d3SBarry Smith 7889b92b1d3SBarry Smith``` 7899b92b1d3SBarry Smith/* Create generic object */ 7909b92b1d3SBarry SmithXXXCreate(..,&obj); 7919b92b1d3SBarry Smith/* Must set all settings here, or default */ 7929b92b1d3SBarry SmithXXXSetFromOptions(obj); 7939b92b1d3SBarry Smith``` 7949b92b1d3SBarry Smith 7959b92b1d3SBarry Smith- Hardwire the type in the code, but allow the user to override it via a subsequent 7969b92b1d3SBarry Smith `XXXSetFromOptions()` call. This essentially allows the user to customize what the 7979b92b1d3SBarry Smith "default" type to of the object. 7989b92b1d3SBarry Smith 7999b92b1d3SBarry Smith``` 8009b92b1d3SBarry Smith/* Create generic object */ 8019b92b1d3SBarry SmithXXXCreate(..,&obj); 8029b92b1d3SBarry Smith/* Set type directly */ 8039b92b1d3SBarry SmithXXXSetYYYYY(obj,...); 8049b92b1d3SBarry Smith/* Can always change to different type */ 8059b92b1d3SBarry SmithXXXSetFromOptions(obj); 8069b92b1d3SBarry Smith``` 8079b92b1d3SBarry Smith 8089b92b1d3SBarry Smith### How do I compile and link my own PETSc application codes and can I use my own `makefile` or rules for compiling code, rather than PETSc's? 8099b92b1d3SBarry Smith 8109b92b1d3SBarry SmithSee the {ref}`section <sec_writing_application_codes>` of the users manual on writing 8119b92b1d3SBarry Smithapplication codes with PETSc. 8129b92b1d3SBarry Smith 8139b92b1d3SBarry Smith### Can I use CMake to build my own project that depends on PETSc? 8149b92b1d3SBarry Smith 8159b92b1d3SBarry SmithSee the {ref}`section <sec_writing_application_codes>` of the users manual on writing 8169b92b1d3SBarry Smithapplication codes with PETSc. 8179b92b1d3SBarry Smith 8189b92b1d3SBarry Smith### How can I put carriage returns in `PetscPrintf()` statements from Fortran? 8199b92b1d3SBarry Smith 8209b92b1d3SBarry SmithYou can use the same notation as in C, just put a `\n` in the string. Note that no other C 8219b92b1d3SBarry Smithformat instruction is supported. 8229b92b1d3SBarry Smith 8239b92b1d3SBarry SmithOr you can use the Fortran concatenation `//` and `char(10)`; for example `'some 8249b92b1d3SBarry Smithstring'//char(10)//'another string` on the next line. 8259b92b1d3SBarry Smith 8269b92b1d3SBarry Smith### How can I implement callbacks using C++ class methods? 8279b92b1d3SBarry Smith 8289b92b1d3SBarry SmithDeclare the class method static. Static methods do not have a `this` pointer, but the 8299b92b1d3SBarry Smith`void*` context parameter will usually be cast to a pointer to the class where it can 8309b92b1d3SBarry Smithserve the same function. 8319b92b1d3SBarry Smith 8329b92b1d3SBarry Smith:::{admonition} Remember 8339b92b1d3SBarry SmithAll PETSc callbacks return `PetscErrorCode`. 8349b92b1d3SBarry Smith::: 8359b92b1d3SBarry Smith 8369b92b1d3SBarry Smith### Everyone knows that when you code Newton's Method you should compute the function and its Jacobian at the same time. How can one do this in PETSc? 8379b92b1d3SBarry Smith 8389b92b1d3SBarry SmithThe update in Newton's method is computed as 8399b92b1d3SBarry Smith 8409b92b1d3SBarry Smith$$ 8419b92b1d3SBarry Smithu^{n+1} = u^n - \lambda * \left[J(u^n)] * F(u^n) \right]^{\dagger} 8429b92b1d3SBarry Smith$$ 8439b92b1d3SBarry Smith 8449b92b1d3SBarry SmithThe reason PETSc doesn't default to computing both the function and Jacobian at the same 8459b92b1d3SBarry Smithtime is: 8469b92b1d3SBarry Smith 8479b92b1d3SBarry Smith- In order to do the line search $F \left(u^n - \lambda * \text{step} \right)$ may 8489b92b1d3SBarry Smith need to be computed for several $\lambda$. The Jacobian is not needed for each of 8499b92b1d3SBarry Smith those and one does not know in advance which will be the final $\lambda$ until 8509b92b1d3SBarry Smith after the function value is computed, so many extra Jacobians may be computed. 8519b92b1d3SBarry Smith- In the final step if $|| F(u^p)||$ satisfies the convergence criteria then a 8529b92b1d3SBarry Smith Jacobian need not be computed. 8539b92b1d3SBarry Smith 8549b92b1d3SBarry SmithYou are free to have your `FormFunction()` compute as much of the Jacobian at that point 8559b92b1d3SBarry Smithas you like, keep the information in the user context (the final argument to 8569b92b1d3SBarry Smith`FormFunction()` and `FormJacobian()`) and then retrieve the information in your 8579b92b1d3SBarry Smith`FormJacobian()` function. 8589b92b1d3SBarry Smith 8599b92b1d3SBarry Smith### Computing the Jacobian or preconditioner is time consuming. Is there any way to compute it less often? 8609b92b1d3SBarry Smith 8619b92b1d3SBarry SmithPETSc has a variety of ways of lagging the computation of the Jacobian or the 8629b92b1d3SBarry Smithpreconditioner. They are documented in the manual page for `SNESComputeJacobian()` 8639b92b1d3SBarry Smithand in the {ref}`users manual <ch_snes>`: 8649b92b1d3SBarry Smith 8659b92b1d3SBarry Smith-s 8669b92b1d3SBarry Smith 8679b92b1d3SBarry Smithnes_lag_jacobian 8689b92b1d3SBarry Smith 8699b92b1d3SBarry Smith(`SNESSetLagJacobian()`) How often Jacobian is rebuilt (use -1 to 8709b92b1d3SBarry Smithnever rebuild, use -2 to rebuild the next time requested and then 8719b92b1d3SBarry Smithnever again). 8729b92b1d3SBarry Smith 8739b92b1d3SBarry Smith-s 8749b92b1d3SBarry Smith 8759b92b1d3SBarry Smithnes_lag_jacobian_persists 8769b92b1d3SBarry Smith 8779b92b1d3SBarry Smith(`SNESSetLagJacobianPersists()`) Forces lagging of Jacobian 8789b92b1d3SBarry Smiththrough multiple `SNES` solves , same as passing -2 to 8799b92b1d3SBarry Smith`-snes_lag_jacobian`. By default, each new `SNES` solve 8809b92b1d3SBarry Smithnormally triggers a recomputation of the Jacobian. 8819b92b1d3SBarry Smith 8829b92b1d3SBarry Smith-s 8839b92b1d3SBarry Smith 8849b92b1d3SBarry Smithnes_lag_preconditioner 8859b92b1d3SBarry Smith 8869b92b1d3SBarry Smith(`SNESSetLagPreconditioner()`) how often the preconditioner is 8879b92b1d3SBarry Smithrebuilt. Note: if you are lagging the Jacobian the system will 8889b92b1d3SBarry Smithknow that the matrix has not changed and will not recompute the 8899b92b1d3SBarry Smith(same) preconditioner. 8909b92b1d3SBarry Smith 8919b92b1d3SBarry Smith-s 8929b92b1d3SBarry Smith 8939b92b1d3SBarry Smithnes_lag_preconditioner_persists 8949b92b1d3SBarry Smith 8959b92b1d3SBarry Smith(`SNESSetLagPreconditionerPersists()`) Preconditioner 8969b92b1d3SBarry Smithlags through multiple `SNES` solves 8979b92b1d3SBarry Smith 8989b92b1d3SBarry Smith:::{note} 8999b92b1d3SBarry SmithThese are often (but does not need to be) used in combination with 9009b92b1d3SBarry Smith`-snes_mf_operator` which applies the fresh Jacobian matrix-free for every 9019b92b1d3SBarry Smithmatrix-vector product. Otherwise the out-of-date matrix vector product, computed with 9029b92b1d3SBarry Smiththe lagged Jacobian will be used. 9039b92b1d3SBarry Smith::: 9049b92b1d3SBarry Smith 9059b92b1d3SBarry SmithBy using `KSPMonitorSet()` and/or `SNESMonitorSet()` one can provide code that monitors the 9069b92b1d3SBarry Smithconvergence rate and automatically triggers an update of the Jacobian or preconditioner 9079b92b1d3SBarry Smithbased on decreasing convergence of the iterative method. For example if the number of `SNES` 9089b92b1d3SBarry Smithiterations doubles one might trigger a new computation of the Jacobian. Experimentation is 9099b92b1d3SBarry Smiththe only general purpose way to determine which approach is best for your problem. 9109b92b1d3SBarry Smith 9119b92b1d3SBarry Smith:::{important} 9129b92b1d3SBarry SmithIt is also vital to experiment on your true problem at the scale you will be solving 9139b92b1d3SBarry Smiththe problem since the performance benefits depend on the exact problem and the problem 9149b92b1d3SBarry Smithsize! 9159b92b1d3SBarry Smith::: 9169b92b1d3SBarry Smith 9179b92b1d3SBarry Smith### How can I use Newton's Method Jacobian free? Can I difference a different function than provided with `SNESSetFunction()`? 9189b92b1d3SBarry Smith 9199b92b1d3SBarry SmithThe simplest way is with the option `-snes_mf`, this will use finite differencing of the 9209b92b1d3SBarry Smithfunction provided to `SNESComputeFunction()` to approximate the action of Jacobian. 9219b92b1d3SBarry Smith 9229b92b1d3SBarry Smith:::{important} 9239b92b1d3SBarry SmithSince no matrix-representation of the Jacobian is provided the `-pc_type` used with 9249b92b1d3SBarry Smiththis option must be `-pc_type none`. You may provide a custom preconditioner with 9259b92b1d3SBarry Smith`SNESGetKSP()`, `KSPGetPC()`, and `PCSetType()` and use `PCSHELL`. 9269b92b1d3SBarry Smith::: 9279b92b1d3SBarry Smith 9289b92b1d3SBarry SmithThe option `-snes_mf_operator` will use Jacobian free to apply the Jacobian (in the 9299b92b1d3SBarry SmithKrylov solvers) but will use whatever matrix you provided with `SNESSetJacobian()` 9309b92b1d3SBarry Smith(assuming you set one) to compute the preconditioner. 9319b92b1d3SBarry Smith 9329b92b1d3SBarry SmithTo write the code (rather than use the options above) use `MatCreateSNESMF()` and pass 9339b92b1d3SBarry Smiththe resulting matrix object to `SNESSetJacobian()`. 9349b92b1d3SBarry Smith 9359b92b1d3SBarry SmithFor purely matrix-free (like `-snes_mf`) pass the matrix object for both matrix 9369b92b1d3SBarry Smitharguments and pass the function `MatMFFDComputeJacobian()`. 9379b92b1d3SBarry Smith 9389b92b1d3SBarry SmithTo provide your own approximate Jacobian matrix to compute the preconditioner (like 9399b92b1d3SBarry Smith`-snes_mf_operator`), pass this other matrix as the second matrix argument to 9409b92b1d3SBarry Smith`SNESSetJacobian()`. Make sure your provided `computejacobian()` function calls 9419b92b1d3SBarry Smith`MatAssemblyBegin()` and `MatAssemblyEnd()` separately on **BOTH** matrix arguments 9429b92b1d3SBarry Smithto this function. See `src/snes/tests/ex7.c`. 9439b92b1d3SBarry Smith 9449b92b1d3SBarry SmithTo difference a different function than that passed to `SNESSetJacobian()` to compute the 9459b92b1d3SBarry Smithmatrix-free Jacobian multiply call `MatMFFDSetFunction()` to set that other function. See 9469b92b1d3SBarry Smith`src/snes/tests/ex7.c.h`. 9479b92b1d3SBarry Smith 9489b92b1d3SBarry Smith(doc_faq_usage_condnum)= 9499b92b1d3SBarry Smith 9509b92b1d3SBarry Smith### How can I determine the condition number of a matrix? 9519b92b1d3SBarry Smith 9529b92b1d3SBarry SmithFor small matrices, the condition number can be reliably computed using 9539b92b1d3SBarry Smith 9549b92b1d3SBarry Smith```text 9559b92b1d3SBarry Smith-pc_type svd -pc_svd_monitor 9569b92b1d3SBarry Smith``` 9579b92b1d3SBarry Smith 9589b92b1d3SBarry SmithFor larger matrices, you can run with 9599b92b1d3SBarry Smith 9609b92b1d3SBarry Smith```text 9619b92b1d3SBarry Smith-pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000 9629b92b1d3SBarry Smith``` 9639b92b1d3SBarry Smith 9649b92b1d3SBarry Smithto get approximations to the condition number of the operator. This will generally be 9659b92b1d3SBarry Smithaccurate for the largest singular values, but may overestimate the smallest singular value 9669b92b1d3SBarry Smithunless the method has converged. Make sure to avoid restarts. To estimate the condition 9679b92b1d3SBarry Smithnumber of the preconditioned operator, use `-pc_type somepc` in the last command. 9689b92b1d3SBarry Smith 9699b92b1d3SBarry SmithYou can use [SLEPc](https://slepc.upv.es) for highly scalable, efficient, and quality eigenvalue computations. 9709b92b1d3SBarry Smith 9719b92b1d3SBarry Smith### How can I compute the inverse of a matrix in PETSc? 9729b92b1d3SBarry Smith 9739b92b1d3SBarry Smith:::{admonition} Are you sure? 9749b92b1d3SBarry Smith:class: yellow 9759b92b1d3SBarry Smith 9769b92b1d3SBarry SmithIt is very expensive to compute the inverse of a matrix and very rarely needed in 9779b92b1d3SBarry Smithpractice. We highly recommend avoiding algorithms that need it. 9789b92b1d3SBarry Smith::: 9799b92b1d3SBarry Smith 9809b92b1d3SBarry SmithThe inverse of a matrix (dense or sparse) is essentially always dense, so begin by 9819b92b1d3SBarry Smithcreating a dense matrix B and fill it with the identity matrix (ones along the diagonal), 9829b92b1d3SBarry Smithalso create a dense matrix X of the same size that will hold the solution. Then factor the 9839b92b1d3SBarry Smithmatrix you wish to invert with `MatLUFactor()` or `MatCholeskyFactor()`, call the 9849b92b1d3SBarry Smithresult A. Then call `MatMatSolve(A,B,X)` to compute the inverse into X. See also section 9859b92b1d3SBarry Smithon {any}`Schur's complement <how_can_i_compute_the_schur_complement>`. 9869b92b1d3SBarry Smith 9879b92b1d3SBarry Smith(how_can_i_compute_the_schur_complement)= 9889b92b1d3SBarry Smith 9899b92b1d3SBarry Smith### How can I compute the Schur complement in PETSc? 9909b92b1d3SBarry Smith 9919b92b1d3SBarry Smith:::{admonition} Are you sure? 9929b92b1d3SBarry Smith:class: yellow 9939b92b1d3SBarry Smith 9949b92b1d3SBarry SmithIt is very expensive to compute the Schur complement of a matrix and very rarely needed 9959b92b1d3SBarry Smithin practice. We highly recommend avoiding algorithms that need it. 9969b92b1d3SBarry Smith::: 9979b92b1d3SBarry Smith 9989b92b1d3SBarry SmithThe Schur complement of the matrix $M \in \mathbb{R}^{\left(p+q \right) \times 9999b92b1d3SBarry Smith\left(p + q \right)}$ 10009b92b1d3SBarry Smith 10019b92b1d3SBarry Smith$$ 10029b92b1d3SBarry SmithM = \begin{bmatrix} 10039b92b1d3SBarry SmithA & B \\ 10049b92b1d3SBarry SmithC & D 10059b92b1d3SBarry Smith\end{bmatrix} 10069b92b1d3SBarry Smith$$ 10079b92b1d3SBarry Smith 10089b92b1d3SBarry Smithwhere 10099b92b1d3SBarry Smith 10109b92b1d3SBarry Smith$$ 10119b92b1d3SBarry SmithA \in \mathbb{R}^{p \times p}, \quad B \in \mathbb{R}^{p \times q}, \quad C \in \mathbb{R}^{q \times p}, \quad D \in \mathbb{R}^{q \times q} 10129b92b1d3SBarry Smith$$ 10139b92b1d3SBarry Smith 10149b92b1d3SBarry Smithis given by 10159b92b1d3SBarry Smith 10169b92b1d3SBarry Smith$$ 10179b92b1d3SBarry SmithS_D := A - BD^{-1}C \\ 10189b92b1d3SBarry SmithS_A := D - CA^{-1}B 10199b92b1d3SBarry Smith$$ 10209b92b1d3SBarry Smith 10219b92b1d3SBarry SmithLike the inverse, the Schur complement of a matrix (dense or sparse) is essentially always 10229b92b1d3SBarry Smithdense, so assuming you wish to calculate $S_A = D - C \underbrace{ 10239b92b1d3SBarry Smith\overbrace{(A^{-1})}^{U} B}_{V}$ begin by: 10249b92b1d3SBarry Smith 10259b92b1d3SBarry Smith1. Forming a dense matrix $B$ 10269b92b1d3SBarry Smith2. Also create another dense matrix $V$ of the same size. 10279b92b1d3SBarry Smith3. Then either factor the matrix $A$ directly with `MatLUFactor()` or 10289b92b1d3SBarry Smith `MatCholeskyFactor()`, or use `MatGetFactor()` followed by 10299b92b1d3SBarry Smith `MatLUFactorSymbolic()` followed by `MatLUFactorNumeric()` if you wish to use and 10309b92b1d3SBarry Smith external solver package like SuperLU_Dist. Call the result $U$. 10319b92b1d3SBarry Smith4. Then call `MatMatSolve(U,B,V)`. 10329b92b1d3SBarry Smith5. Then call `MatMatMult(C,V,MAT_INITIAL_MATRIX,1.0,&S)`. 10339b92b1d3SBarry Smith6. Now call `MatAXPY(S,-1.0,D,MAT_SUBSET_NONZERO)`. 10349b92b1d3SBarry Smith7. Followed by `MatScale(S,-1.0)`. 10359b92b1d3SBarry Smith 10369b92b1d3SBarry SmithFor computing Schur complements like this it does not make sense to use the `KSP` 10379b92b1d3SBarry Smithiterative solvers since for solving many moderate size problems using a direct 10389b92b1d3SBarry Smithfactorization is much faster than iterative solvers. As you can see, this requires a great 10399b92b1d3SBarry Smithdeal of work space and computation so is best avoided. 10409b92b1d3SBarry Smith 10419b92b1d3SBarry SmithHowever, it is not necessary to assemble the Schur complement $S$ in order to solve 10429b92b1d3SBarry Smithsystems with it. Use `MatCreateSchurComplement(A,A_pre,B,C,D,&S)` to create a 10439b92b1d3SBarry Smithmatrix that applies the action of $S$ (using `A_pre` to solve with `A`), but 10449b92b1d3SBarry Smithdoes not assemble. 10459b92b1d3SBarry Smith 10469b92b1d3SBarry SmithAlternatively, if you already have a block matrix `M = [A, B; C, D]` (in some 10479b92b1d3SBarry Smithordering), then you can create index sets (`IS`) `isa` and `isb` to address each 10489b92b1d3SBarry Smithblock, then use `MatGetSchurComplement()` to create the Schur complement and/or an 10499b92b1d3SBarry Smithapproximation suitable for preconditioning. 10509b92b1d3SBarry Smith 10519b92b1d3SBarry SmithSince $S$ is generally dense, standard preconditioning methods cannot typically be 10529b92b1d3SBarry Smithapplied directly to Schur complements. There are many approaches to preconditioning Schur 10539b92b1d3SBarry Smithcomplements including using the `SIMPLE` approximation 10549b92b1d3SBarry Smith 10559b92b1d3SBarry Smith$$ 10569b92b1d3SBarry SmithD - C \text{diag}(A)^{-1} B 10579b92b1d3SBarry Smith$$ 10589b92b1d3SBarry Smith 10599b92b1d3SBarry Smithto create a sparse matrix that approximates the Schur complement (this is returned by 10609b92b1d3SBarry Smithdefault for the optional "preconditioning" matrix in `MatGetSchurComplement()`). 10619b92b1d3SBarry Smith 10629b92b1d3SBarry SmithAn alternative is to interpret the matrices as differential operators and apply 10639b92b1d3SBarry Smithapproximate commutator arguments to find a spectrally equivalent operation that can be 10649b92b1d3SBarry Smithapplied efficiently (see the "PCD" preconditioners {cite}`elman_silvester_wathen_2014`). A 10659b92b1d3SBarry Smithvariant of this is the least squares commutator, which is closely related to the 10669b92b1d3SBarry SmithMoore-Penrose pseudoinverse, and is available in `PCLSC` which operates on matrices of 10679b92b1d3SBarry Smithtype `MATSCHURCOMPLEMENT`. 10689b92b1d3SBarry Smith 10699b92b1d3SBarry Smith### Do you have examples of doing unstructured grid Finite Element Method (FEM) with PETSc? 10709b92b1d3SBarry Smith 10719b92b1d3SBarry SmithThere are at least three ways to write finite element codes using PETSc: 10729b92b1d3SBarry Smith 10739b92b1d3SBarry Smith1. Use `DMPLEX`, which is a high level approach to manage your mesh and 10749b92b1d3SBarry Smith discretization. See the {ref}`tutorials sections <tut_stokes>` for further information, 10759b92b1d3SBarry Smith or see `src/snes/tutorial/ex62.c`. 10769b92b1d3SBarry Smith2. Use packages such as [deal.ii](https://www.dealii.org), [libMesh](https://libmesh.github.io), or 10779b92b1d3SBarry Smith [Firedrake](https://www.firedrakeproject.org), which use PETSc for the solvers. 10789b92b1d3SBarry Smith3. Manage the grid data structure yourself and use PETSc `PetscSF`, `IS`, and `VecScatter` to 10799b92b1d3SBarry Smith communicate the required ghost point communication. See 10809b92b1d3SBarry Smith `src/snes/tutorials/ex10d/ex10.c`. 10819b92b1d3SBarry Smith 10829b92b1d3SBarry Smith### DMDA decomposes the domain differently than the MPI_Cart_create() command. How can one use them together? 10839b92b1d3SBarry Smith 10849b92b1d3SBarry SmithThe `MPI_Cart_create()` first divides the mesh along the z direction, then the y, then 10859b92b1d3SBarry Smiththe x. `DMDA` divides along the x, then y, then z. Thus, for example, rank 1 of the 10869b92b1d3SBarry Smithprocesses will be in a different part of the mesh for the two schemes. To resolve this you 10879b92b1d3SBarry Smithcan create a new MPI communicator that you pass to `DMDACreate()` that renumbers the 10889b92b1d3SBarry Smithprocess ranks so that each physical process shares the same part of the mesh with both the 10899b92b1d3SBarry Smith`DMDA` and the `MPI_Cart_create()`. The code to determine the new numbering was 10909b92b1d3SBarry Smithprovided by Rolf Kuiper: 10919b92b1d3SBarry Smith 10929b92b1d3SBarry Smith``` 10939b92b1d3SBarry Smith// the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively 10949b92b1d3SBarry Smith// (no parallelization in direction 'dir' means dir_procs = 1) 10959b92b1d3SBarry Smith 10969b92b1d3SBarry SmithMPI_Comm NewComm; 10979b92b1d3SBarry Smithint x, y, z; 10989b92b1d3SBarry SmithPetscMPIInt MPI_Rank, NewRank; 10999b92b1d3SBarry Smith 11009b92b1d3SBarry Smith// get rank from MPI ordering: 11019b92b1d3SBarry SmithPetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &MPI_Rank)); 11029b92b1d3SBarry Smith 11039b92b1d3SBarry Smith// calculate coordinates of cpus in MPI ordering: 11049b92b1d3SBarry Smithx = MPI_rank / (z_procs*y_procs); 11059b92b1d3SBarry Smithy = (MPI_rank % (z_procs*y_procs)) / z_procs; 11069b92b1d3SBarry Smithz = (MPI_rank % (z_procs*y_procs)) % z_procs; 11079b92b1d3SBarry Smith 11089b92b1d3SBarry Smith// set new rank according to PETSc ordering: 11099b92b1d3SBarry SmithNewRank = z*y_procs*x_procs + y*x_procs + x; 11109b92b1d3SBarry Smith 11119b92b1d3SBarry Smith// create communicator with new ranks according to PETSc ordering 11129b92b1d3SBarry SmithPetscCallMPI(MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm)); 11139b92b1d3SBarry Smith 11149b92b1d3SBarry Smith// override the default communicator (was MPI_COMM_WORLD as default) 11159b92b1d3SBarry SmithPETSC_COMM_WORLD = NewComm; 11169b92b1d3SBarry Smith``` 11179b92b1d3SBarry Smith 11189b92b1d3SBarry Smith### When solving a system with Dirichlet boundary conditions I can use MatZeroRows() to eliminate the Dirichlet rows but this results in a non-Symmetric system. How can I apply Dirichlet boundary conditions but keep the matrix symmetric? 11199b92b1d3SBarry Smith 11209b92b1d3SBarry Smith- For nonsymmetric systems put the appropriate boundary solutions in the x vector and use 11219b92b1d3SBarry Smith `MatZeroRows()` followed by `KSPSetOperators()`. 11229b92b1d3SBarry Smith- For symmetric problems use `MatZeroRowsColumns()`. 11239b92b1d3SBarry Smith- If you have many Dirichlet locations you can use `MatZeroRows()` (**not** 11249b92b1d3SBarry Smith `MatZeroRowsColumns()`) and `-ksp_type preonly -pc_type redistribute` (see 11259b92b1d3SBarry Smith `PCREDISTRIBUTE`) and PETSc will repartition the parallel matrix for load 11269b92b1d3SBarry Smith balancing. In this case the new matrix solved remains symmetric even though 11279b92b1d3SBarry Smith `MatZeroRows()` is used. 11289b92b1d3SBarry Smith 11299b92b1d3SBarry SmithAn alternative approach is, when assembling the matrix (generating values and passing 11309b92b1d3SBarry Smiththem to the matrix), never to include locations for the Dirichlet grid points in the 11319b92b1d3SBarry Smithvector and matrix, instead taking them into account as you put the other values into the 11329b92b1d3SBarry Smithload. 11339b92b1d3SBarry Smith 11349b92b1d3SBarry Smith### How can I get PETSc vectors and matrices to MATLAB or vice versa? 11359b92b1d3SBarry Smith 11369b92b1d3SBarry SmithThere are numerous ways to work with PETSc and MATLAB. All but the first approach 11379b92b1d3SBarry Smithrequire PETSc to be configured with --with-matlab. 11389b92b1d3SBarry Smith 11399b92b1d3SBarry Smith1. To save PETSc `Mat` and `Vec` to files that can be read from MATLAB use 11409b92b1d3SBarry Smith `PetscViewerBinaryOpen()` viewer and `VecView()` or `MatView()` to save objects 11419b92b1d3SBarry Smith for MATLAB and `VecLoad()` and `MatLoad()` to get the objects that MATLAB has 11429b92b1d3SBarry Smith saved. See `share/petsc/matlab/PetscBinaryRead.m` and 11439b92b1d3SBarry Smith `share/petsc/matlab/PetscBinaryWrite.m` for loading and saving the objects in MATLAB. 11449b92b1d3SBarry Smith2. Using the [MATLAB Engine](https://www.mathworks.com/help/matlab/calling-matlab-engine-from-c-programs-1.html), 11459b92b1d3SBarry Smith allows PETSc to automatically call MATLAB to perform some specific computations. This 11469b92b1d3SBarry Smith does not allow MATLAB to be used interactively by the user. See the 11479b92b1d3SBarry Smith `PetscMatlabEngine`. 11489b92b1d3SBarry Smith3. You can open a socket connection between MATLAB and PETSc to allow sending objects back 11499b92b1d3SBarry Smith and forth between an interactive MATLAB session and a running PETSc program. See 11509b92b1d3SBarry Smith `PetscViewerSocketOpen()` for access from the PETSc side and 11519b92b1d3SBarry Smith `share/petsc/matlab/PetscReadBinary.m` for access from the MATLAB side. 11529b92b1d3SBarry Smith4. You can save PETSc `Vec` (**not** `Mat`) with the `PetscViewerMatlabOpen()` 11539b92b1d3SBarry Smith viewer that saves `.mat` files can then be loaded into MATLAB using the `load()` command 11549b92b1d3SBarry Smith 11559b92b1d3SBarry Smith### How do I get started with Cython so that I can extend petsc4py? 11569b92b1d3SBarry Smith 11579b92b1d3SBarry Smith1. Learn how to [build a Cython module](http://docs.cython.org/src/quickstart/build.html). 11589b92b1d3SBarry Smith2. Go through the simple [example](https://stackoverflow.com/questions/3046305/simple-wrapping-of-c-code-with-cython). Note 11599b92b1d3SBarry Smith also the next comment that shows how to create numpy arrays in the Cython and pass them 11609b92b1d3SBarry Smith back. 11619b92b1d3SBarry Smith3. Check out [this page](http://docs.cython.org/src/tutorial/numpy.html) which tells 11629b92b1d3SBarry Smith you how to get fast indexing. 11639b92b1d3SBarry Smith4. Have a look at the petsc4py [array source](http://code.google.com/p/petsc4py/source/browse/src/PETSc/arraynpy.pxi). 11649b92b1d3SBarry Smith 11659b92b1d3SBarry Smith### How do I compute a custom norm for KSP to use as a convergence test or for monitoring? 11669b92b1d3SBarry Smith 11679b92b1d3SBarry SmithYou need to call `KSPBuildResidual()` on your `KSP` object and then compute the 11689b92b1d3SBarry Smithappropriate norm on the resulting residual. Note that depending on the 11699b92b1d3SBarry Smith`KSPSetNormType()` of the method you may not return the same norm as provided by the 11709b92b1d3SBarry Smithmethod. See also `KSPSetPCSide()`. 11719b92b1d3SBarry Smith 11729b92b1d3SBarry Smith### If I have a sequential program can I use a PETSc parallel solver? 11739b92b1d3SBarry Smith 11749b92b1d3SBarry Smith:::{important} 11759b92b1d3SBarry SmithDo not expect to get great speedups! Much of the speedup gained by using parallel 11769b92b1d3SBarry Smithsolvers comes from building the underlying matrices and vectors in parallel to begin 11779b92b1d3SBarry Smithwith. You should see some reduction in the time for the linear solvers. 11789b92b1d3SBarry Smith::: 11799b92b1d3SBarry Smith 11809b92b1d3SBarry SmithYes, you must set up PETSc with MPI (even though you will not use MPI) with at least the 11819b92b1d3SBarry Smithfollowing options: 11829b92b1d3SBarry Smith 11839b92b1d3SBarry Smith```console 11849b92b1d3SBarry Smith$ ./configure --download-superlu_dist --download-parmetis --download-metis --with-openmp 11859b92b1d3SBarry Smith``` 11869b92b1d3SBarry Smith 11879b92b1d3SBarry SmithYour compiler must support OpenMP. To have the linear solver run in parallel run your 11889b92b1d3SBarry Smithprogram with 11899b92b1d3SBarry Smith 11909b92b1d3SBarry Smith```console 11919b92b1d3SBarry Smith$ OMP_NUM_THREADS=n ./myprog -pc_type lu -pc_factor_mat_solver superlu_dist 11929b92b1d3SBarry Smith``` 11939b92b1d3SBarry Smith 11949b92b1d3SBarry Smithwhere `n` is the number of threads and should be less than or equal to the number of cores 11959b92b1d3SBarry Smithavailable. 11969b92b1d3SBarry Smith 11979b92b1d3SBarry Smith:::{note} 11989b92b1d3SBarry SmithIf your code is MPI parallel you can also use these same options to have SuperLU_dist 11999b92b1d3SBarry Smithutilize multiple threads per MPI process for the direct solver. Make sure that the 12009b92b1d3SBarry Smith`$OMP_NUM_THREADS` you use per MPI process is less than or equal to the number of 12019b92b1d3SBarry Smithcores available for each MPI process. For example if your compute nodes have 6 cores 12029b92b1d3SBarry Smithand you use 2 MPI processes per node then set `$OMP_NUM_THREADS` to 2 or 3. 12039b92b1d3SBarry Smith::: 12049b92b1d3SBarry Smith 12059b92b1d3SBarry SmithAnother approach that allows using a PETSc parallel solver is to use `PCMPI`. 12069b92b1d3SBarry Smith 12079b92b1d3SBarry Smith### TS or SNES produces infeasible (out of domain) solutions or states. How can I prevent this? 12089b92b1d3SBarry Smith 12099b92b1d3SBarry SmithFor `TS` call `TSSetFunctionDomainError()`. For both `TS` and `SNES` call 12109b92b1d3SBarry Smith`SNESSetFunctionDomainError()` when the solver passes an infeasible (out of domain) 12119b92b1d3SBarry Smithsolution or state to your routines. 12129b92b1d3SBarry Smith 12139b92b1d3SBarry SmithIf it occurs for DAEs, it is important to insure the algebraic constraints are well 12149b92b1d3SBarry Smithsatisfied, which can prevent "breakdown" later. Thus, one can try using a tight tolerance 12159b92b1d3SBarry Smithfor `SNES`, using a direct linear solver (`PCType` of `PCLU`) when possible, and reducing the timestep (or 12169b92b1d3SBarry Smithtightening `TS` tolerances for adaptive time stepping). 12179b92b1d3SBarry Smith 12189b92b1d3SBarry Smith### Can PETSc work with Hermitian matrices? 12199b92b1d3SBarry Smith 12209b92b1d3SBarry SmithPETSc's support of Hermitian matrices is limited. Many operations and solvers work 12219b92b1d3SBarry Smithfor symmetric (`MATSBAIJ`) matrices and operations on transpose matrices but there is 12229b92b1d3SBarry Smithlittle direct support for Hermitian matrices and Hermitian transpose (complex conjugate 12239b92b1d3SBarry Smithtranspose) operations. There is `KSPSolveTranspose()` for solving the transpose of a 12249b92b1d3SBarry Smithlinear system but no `KSPSolveHermitian()`. 12259b92b1d3SBarry Smith 12269b92b1d3SBarry SmithFor creating known Hermitian matrices: 12279b92b1d3SBarry Smith 12289b92b1d3SBarry Smith- `MatCreateNormalHermitian()` 12299b92b1d3SBarry Smith- `MatCreateHermitianTranspose()` 12309b92b1d3SBarry Smith 12319b92b1d3SBarry SmithFor determining or setting Hermitian status on existing matrices: 12329b92b1d3SBarry Smith 12339b92b1d3SBarry Smith- `MatIsHermitian()` 12349b92b1d3SBarry Smith- `MatIsHermitianKnown()` 12359b92b1d3SBarry Smith- `MatIsStructurallySymmetric()` 12369b92b1d3SBarry Smith- `MatIsSymmetricKnown()` 12379b92b1d3SBarry Smith- `MatIsSymmetric()` 12389b92b1d3SBarry Smith- `MatSetOption()` (use with `MAT_SYMMETRIC` or `MAT_HERMITIAN` to assert to PETSc 12399b92b1d3SBarry Smith that either is the case). 12409b92b1d3SBarry Smith 12419b92b1d3SBarry SmithFor performing matrix operations on known Hermitian matrices (note that regular `Mat` 12429b92b1d3SBarry Smithfunctions such as `MatMult()` will of course also work on Hermitian matrices): 12439b92b1d3SBarry Smith 12449b92b1d3SBarry Smith- `MatMultHermitianTranspose()` 12459b92b1d3SBarry Smith- `MatMultHermitianTransposeAdd()` (very limited support) 12469b92b1d3SBarry Smith 12479b92b1d3SBarry Smith### How can I assemble a bunch of similar matrices? 12489b92b1d3SBarry Smith 12499b92b1d3SBarry SmithYou can first add the values common to all the matrices, then use `MatStoreValues()` to 12509b92b1d3SBarry Smithstash the common values. Each iteration you call `MatRetrieveValues()`, then set the 12519b92b1d3SBarry Smithunique values in a matrix and assemble. 12529b92b1d3SBarry Smith 12539b92b1d3SBarry Smith### Can one resize or change the size of PETSc matrices or vectors? 12549b92b1d3SBarry Smith 12559b92b1d3SBarry SmithNo, once the vector or matrices sizes have been set and the matrices or vectors are fully 12569b92b1d3SBarry Smithusable one cannot change the size of the matrices or vectors or number of processors they 12579b92b1d3SBarry Smithlive on. One may create new vectors and copy, for example using `VecScatterCreate()`, 12589b92b1d3SBarry Smiththe old values from the previous vector. 12599b92b1d3SBarry Smith 12609b92b1d3SBarry Smith### How can one compute the nullspace of a sparse matrix with MUMPS? 12619b92b1d3SBarry Smith 12629b92b1d3SBarry SmithAssuming you have an existing matrix $A$ whose nullspace $V$ you want to find: 12639b92b1d3SBarry Smith 12649b92b1d3SBarry Smith``` 12659b92b1d3SBarry SmithMat F, work, V; 12669b92b1d3SBarry SmithPetscInt N, rows; 12679b92b1d3SBarry Smith 12689b92b1d3SBarry Smith/* Determine factorability */ 12699b92b1d3SBarry SmithPetscCall(MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F)); 12709b92b1d3SBarry SmithPetscCall(MatGetLocalSize(A, &rows, NULL)); 12719b92b1d3SBarry Smith 12729b92b1d3SBarry Smith/* Set MUMPS options, see MUMPS documentation for more information */ 12739b92b1d3SBarry SmithPetscCall(MatMumpsSetIcntl(F, 24, 1)); 12749b92b1d3SBarry SmithPetscCall(MatMumpsSetIcntl(F, 25, 1)); 12759b92b1d3SBarry Smith 12769b92b1d3SBarry Smith/* Perform factorization */ 12779b92b1d3SBarry SmithPetscCall(MatLUFactorSymbolic(F, A, NULL, NULL, NULL)); 12789b92b1d3SBarry SmithPetscCall(MatLUFactorNumeric(F, A, NULL)); 12799b92b1d3SBarry Smith 12809b92b1d3SBarry Smith/* This is the dimension of the null space */ 12819b92b1d3SBarry SmithPetscCall(MatMumpsGetInfog(F, 28, &N)); 12829b92b1d3SBarry Smith 12839b92b1d3SBarry Smith/* This will contain the null space in the columns */ 12849b92b1d3SBarry SmithPetscCall(MatCreateDense(comm, rows, N, PETSC_DETERMINE, PETSC_DETERMINE, NULL, &V)); 12859b92b1d3SBarry Smith 12869b92b1d3SBarry SmithPetscCall(MatDuplicate(V, MAT_DO_NOT_COPY_VALUES, &work)); 12879b92b1d3SBarry SmithPetscCall(MatMatSolve(F, work, V)); 12889b92b1d3SBarry Smith``` 12899b92b1d3SBarry Smith 12909b92b1d3SBarry Smith______________________________________________________________________ 12919b92b1d3SBarry Smith 12929b92b1d3SBarry Smith## Execution 12939b92b1d3SBarry Smith 12949b92b1d3SBarry Smith### PETSc executables are SO big and take SO long to link 12959b92b1d3SBarry Smith 12969b92b1d3SBarry Smith:::{note} 12979b92b1d3SBarry SmithSee {ref}`shared libraries section <doc_faq_sharedlibs>` for more information. 12989b92b1d3SBarry Smith::: 12999b92b1d3SBarry Smith 13009b92b1d3SBarry SmithWe find this annoying as well. On most machines PETSc can use shared libraries, so 13019b92b1d3SBarry Smithexecutables should be much smaller, run `configure` with the additional option 13029b92b1d3SBarry Smith`--with-shared-libraries` (this is the default). Also, if you have room, compiling and 13039b92b1d3SBarry Smithlinking PETSc on your machine's `/tmp` disk or similar local disk, rather than over the 13049b92b1d3SBarry Smithnetwork will be much faster. 13059b92b1d3SBarry Smith 13069b92b1d3SBarry Smith### How does PETSc's `-help` option work? Why is it different for different programs? 13079b92b1d3SBarry Smith 13089b92b1d3SBarry SmithThere are 2 ways in which one interacts with the options database: 13099b92b1d3SBarry Smith 13109b92b1d3SBarry Smith- `PetscOptionsGetXXX()` where `XXX` is some type or data structure (for example 13119b92b1d3SBarry Smith `PetscOptionsGetBool()` or `PetscOptionsGetScalarArray()`). This is a classic 13129b92b1d3SBarry Smith "getter" function, which queries the command line options for a matching option name, 13139b92b1d3SBarry Smith and returns the specified value. 13149b92b1d3SBarry Smith- `PetscOptionsXXX()` where `XXX` is some type or data structure (for example 13159b92b1d3SBarry Smith `PetscOptionsBool()` or `PetscOptionsScalarArray()`). This is a so-called "provider" 13169b92b1d3SBarry Smith function. It first records the option name in an internal list of previously encountered 13179b92b1d3SBarry Smith options before calling `PetscOptionsGetXXX()` to query the status of said option. 13189b92b1d3SBarry Smith 13199b92b1d3SBarry SmithWhile users generally use the first option, developers will *always* use the second 13209b92b1d3SBarry Smith(provider) variant of functions. Thus, as the program runs, it will build up a list of 13219b92b1d3SBarry Smithencountered option names which are then printed **in the order of their appearance on the 13229b92b1d3SBarry Smithroot rank**. Different programs may take different paths through PETSc source code, so 13239b92b1d3SBarry Smiththey will encounter different providers, and therefore have different `-help` output. 13249b92b1d3SBarry Smith 13259b92b1d3SBarry Smith### PETSc has so many options for my program that it is hard to keep them straight 13269b92b1d3SBarry Smith 13279b92b1d3SBarry SmithRunning the PETSc program with the option `-help` will print out many of the options. To 13289b92b1d3SBarry Smithprint the options that have been specified within a program, employ `-options_left` to 13299b92b1d3SBarry Smithprint any options that the user specified but were not actually used by the program and 13309b92b1d3SBarry Smithall options used; this is helpful for detecting typo errors. The PETSc website has a search option, 13319b92b1d3SBarry Smithin the upper right hand corner, that quickly finds answers to most PETSc questions. 13329b92b1d3SBarry Smith 13339b92b1d3SBarry Smith### PETSc automatically handles many of the details in parallel PDE solvers. How can I understand what is really happening within my program? 13349b92b1d3SBarry Smith 13359b92b1d3SBarry SmithYou can use the option `-info` to get more details about the solution process. The 13369b92b1d3SBarry Smithoption `-log_view` provides details about the distribution of time spent in the various 13379b92b1d3SBarry Smithphases of the solution process. You can run with `-ts_view` or `-snes_view` or 13389b92b1d3SBarry Smith`-ksp_view` to see what solver options are being used. Run with `-ts_monitor`, 13399b92b1d3SBarry Smith`-snes_monitor`, or `-ksp_monitor` to watch convergence of the 13409b92b1d3SBarry Smithmethods. `-snes_converged_reason` and `-ksp_converged_reason` will indicate why and if 13419b92b1d3SBarry Smiththe solvers have converged. 13429b92b1d3SBarry Smith 13439b92b1d3SBarry Smith### Assembling large sparse matrices takes a long time. What can I do to make this process faster? Or MatSetValues() is so slow; what can I do to speed it up? 13449b92b1d3SBarry Smith 13459b92b1d3SBarry SmithYou probably need to do preallocation, as explained in {any}`sec_matsparse`. 13469b92b1d3SBarry SmithSee also the {ref}`performance chapter <ch_performance>` of the users manual. 13479b92b1d3SBarry Smith 13489b92b1d3SBarry SmithFor GPUs (and even CPUs) you can use `MatSetPreallocationCOO()` and `MatSetValuesCOO()` for more rapid assembly. 13499b92b1d3SBarry Smith 13509b92b1d3SBarry Smith### How can I generate performance summaries with PETSc? 13519b92b1d3SBarry Smith 13529b92b1d3SBarry SmithUse this option at runtime: 13539b92b1d3SBarry Smith 13549b92b1d3SBarry Smith-l 13559b92b1d3SBarry Smith 13569b92b1d3SBarry Smithog_view 13579b92b1d3SBarry Smith 13589b92b1d3SBarry SmithOutputs a comprehensive timing, memory consumption, and communications digest 13599b92b1d3SBarry Smithfor your program. See the {ref}`profiling chapter <ch_profiling>` of the users 13609b92b1d3SBarry Smithmanual for information on interpreting the summary data. 13619b92b1d3SBarry Smith 13629b92b1d3SBarry Smith### How do I know the amount of time spent on each level of the multigrid solver/preconditioner? 13639b92b1d3SBarry Smith 13649b92b1d3SBarry SmithRun with `-log_view` and `-pc_mg_log` 13659b92b1d3SBarry Smith 13669b92b1d3SBarry Smith### Where do I get the input matrices for the examples? 13679b92b1d3SBarry Smith 13689b92b1d3SBarry SmithSome examples use `$DATAFILESPATH/matrices/medium` and other files. These test matrices 13699b92b1d3SBarry Smithin PETSc binary format can be found in the [datafiles repository](https://gitlab.com/petsc/datafiles). 13709b92b1d3SBarry Smith 13719b92b1d3SBarry Smith### When I dump some matrices and vectors to binary, I seem to be generating some empty files with `.info` extensions. What's the deal with these? 13729b92b1d3SBarry Smith 13739b92b1d3SBarry SmithPETSc binary viewers put some additional information into `.info` files like matrix 13749b92b1d3SBarry Smithblock size. It is harmless but if you *really* don't like it you can use 13759b92b1d3SBarry Smith`-viewer_binary_skip_info` or `PetscViewerBinarySkipInfo()`. 13769b92b1d3SBarry Smith 13779b92b1d3SBarry Smith:::{note} 13789b92b1d3SBarry SmithYou need to call `PetscViewerBinarySkipInfo()` before 13799b92b1d3SBarry Smith`PetscViewerFileSetName()`. In other words you **cannot** use 13809b92b1d3SBarry Smith`PetscViewerBinaryOpen()` directly. 13819b92b1d3SBarry Smith::: 13829b92b1d3SBarry Smith 13839b92b1d3SBarry Smith### Why is my parallel solver slower than my sequential solver, or I have poor speed-up? 13849b92b1d3SBarry Smith 13859b92b1d3SBarry SmithThis can happen for many reasons: 13869b92b1d3SBarry Smith 13879b92b1d3SBarry Smith1. Make sure it is truly the time in `KSPSolve()` that is slower (by running the code 13889b92b1d3SBarry Smith with `-log_view`). Often the slower time is in generating the matrix or some other 13899b92b1d3SBarry Smith operation. 13909b92b1d3SBarry Smith2. There must be enough work for each process to outweigh the communication time. We 13919b92b1d3SBarry Smith recommend an absolute minimum of about 10,000 unknowns per process, better is 20,000 or 13929b92b1d3SBarry Smith more. This is even more true when using multiple GPUs, where you need to have millions 13939b92b1d3SBarry Smith of unknowns per GPU. 13949b92b1d3SBarry Smith3. Make sure the {ref}`communication speed of the parallel computer 13959b92b1d3SBarry Smith <doc_faq_general_parallel>` is good enough for parallel solvers. 13969b92b1d3SBarry Smith4. Check the number of solver iterates with the parallel solver against the sequential 13979b92b1d3SBarry Smith solver. Most preconditioners require more iterations when used on more processes, this 13989b92b1d3SBarry Smith is particularly true for block Jacobi (the default parallel preconditioner). You can 13999b92b1d3SBarry Smith try `-pc_type asm` (`PCASM`) its iterations scale a bit better for more 14009b92b1d3SBarry Smith processes. You may also consider multigrid preconditioners like `PCMG` or BoomerAMG 14019b92b1d3SBarry Smith in `PCHYPRE`. 14029b92b1d3SBarry Smith 14039b92b1d3SBarry Smith(doc_faq_pipelined)= 14049b92b1d3SBarry Smith 14059b92b1d3SBarry Smith### What steps are necessary to make the pipelined solvers execute efficiently? 14069b92b1d3SBarry Smith 14079b92b1d3SBarry SmithPipelined solvers like `KSPPGMRES`, `KSPPIPECG`, `KSPPIPECR`, and `KSPGROPPCG` may 14089b92b1d3SBarry Smithrequire special MPI configuration to effectively overlap reductions with computation. In 14099b92b1d3SBarry Smithgeneral, this requires an MPI-3 implementation, an implementation that supports multiple 14109b92b1d3SBarry Smiththreads, and use of a "progress thread". Consult the documentation from your vendor or 14119b92b1d3SBarry Smithcomputing facility for more details. 14129b92b1d3SBarry Smith 14139b92b1d3SBarry Smith- Cray MPI MPT-5.6 MPI-3, by setting `$MPICH_MAX_THREAD_SAFETY` to "multiple" 14149b92b1d3SBarry Smith for threads, plus either `$MPICH_ASYNC_PROGRESS` or 14159b92b1d3SBarry Smith `$MPICH_NEMESIS_ASYNC_PROGRESS`. E.g. 14169b92b1d3SBarry Smith ```console 14179b92b1d3SBarry Smith $ export MPICH_MAX_THREAD_SAFETY=multiple 14189b92b1d3SBarry Smith $ export MPICH_ASYNC_PROGRESS=1 14199b92b1d3SBarry Smith $ export MPICH_NEMESIS_ASYNC_PROGRESS=1 14209b92b1d3SBarry Smith ``` 14219b92b1d3SBarry Smith 14229b92b1d3SBarry Smith- MPICH version 3.0 and later implements the MPI-3 standard and the default 14239b92b1d3SBarry Smith configuration supports use of threads. Use of a progress thread is configured by 14249b92b1d3SBarry Smith setting the environment variable `$MPICH_ASYNC_PROGRESS`. E.g. 14259b92b1d3SBarry Smith ```console 14269b92b1d3SBarry Smith $ export MPICH_ASYNC_PROGRESS=1 14279b92b1d3SBarry Smith ``` 14289b92b1d3SBarry Smith 14299b92b1d3SBarry Smith### When using PETSc in single precision mode (`--with-precision=single` when running `configure`) are the operations done in single or double precision? 14309b92b1d3SBarry Smith 14319b92b1d3SBarry SmithPETSc does **NOT** do any explicit conversion of single precision to double before 14329b92b1d3SBarry Smithperforming computations; it depends on the hardware and compiler for what happens. For 14339b92b1d3SBarry Smithexample, the compiler could choose to put the single precision numbers into the usual 14349b92b1d3SBarry Smithdouble precision registers and then use the usual double precision floating point unit. Or 14359b92b1d3SBarry Smithit could use SSE2 instructions that work directly on the single precision numbers. It is a 14369b92b1d3SBarry Smithbit of a mystery what decisions get made sometimes. There may be compiler flags in some 14379b92b1d3SBarry Smithcircumstances that can affect this. 14389b92b1d3SBarry Smith 14399b92b1d3SBarry Smith### Why is Newton's method (SNES) not converging, or converges slowly? 14409b92b1d3SBarry Smith 14419b92b1d3SBarry SmithNewton's method may not converge for many reasons, here are some of the most common: 14429b92b1d3SBarry Smith 14439b92b1d3SBarry Smith- The Jacobian is wrong (or correct in sequential but not in parallel). 14449b92b1d3SBarry Smith- The linear system is {ref}`not solved <doc_faq_execution_kspconv>` or is not solved 14459b92b1d3SBarry Smith accurately enough. 14469b92b1d3SBarry Smith- The Jacobian system has a singularity that the linear solver is not handling. 14479b92b1d3SBarry Smith- There is a bug in the function evaluation routine. 14489b92b1d3SBarry Smith- The function is not continuous or does not have continuous first derivatives (e.g. phase 14499b92b1d3SBarry Smith change or TVD limiters). 14509b92b1d3SBarry Smith- The equations may not have a solution (e.g. limit cycle instead of a steady state) or 14519b92b1d3SBarry Smith there may be a "hill" between the initial guess and the steady state (e.g. reactants 14529b92b1d3SBarry Smith must ignite and burn before reaching a steady state, but the steady-state residual will 14539b92b1d3SBarry Smith be larger during combustion). 14549b92b1d3SBarry Smith 14559b92b1d3SBarry SmithHere are some of the ways to help debug lack of convergence of Newton: 14569b92b1d3SBarry Smith 14579b92b1d3SBarry Smith- Run on one processor to see if the problem is only in parallel. 14589b92b1d3SBarry Smith 14599b92b1d3SBarry Smith- Run with `-info` to get more detailed information on the solution process. 14609b92b1d3SBarry Smith 14619b92b1d3SBarry Smith- Run with the options 14629b92b1d3SBarry Smith 14639b92b1d3SBarry Smith ```text 14649b92b1d3SBarry Smith -snes_monitor -ksp_monitor_true_residual -snes_converged_reason -ksp_converged_reason 14659b92b1d3SBarry Smith ``` 14669b92b1d3SBarry Smith 14679b92b1d3SBarry Smith - If the linear solve does not converge, check if the Jacobian is correct, then see 14689b92b1d3SBarry Smith {ref}`this question <doc_faq_execution_kspconv>`. 14699b92b1d3SBarry Smith - If the preconditioned residual converges, but the true residual does not, the 14709b92b1d3SBarry Smith preconditioner may be singular. 14719b92b1d3SBarry Smith - If the linear solve converges well, but the line search fails, the Jacobian may be 14729b92b1d3SBarry Smith incorrect. 14739b92b1d3SBarry Smith 14749b92b1d3SBarry Smith- Run with `-pc_type lu` or `-pc_type svd` to see if the problem is a poor linear 14759b92b1d3SBarry Smith solver. 14769b92b1d3SBarry Smith 14779b92b1d3SBarry Smith- Run with `-mat_view` or `-mat_view draw` to see if the Jacobian looks reasonable. 14789b92b1d3SBarry Smith 14799b92b1d3SBarry Smith- Run with `-snes_test_jacobian -snes_test_jacobian_view` to see if the Jacobian you are 14809b92b1d3SBarry Smith using is wrong. Compare the output when you add `-mat_fd_type ds` to see if the result 14819b92b1d3SBarry Smith is sensitive to the choice of differencing parameter. 14829b92b1d3SBarry Smith 14839b92b1d3SBarry Smith- Run with `-snes_mf_operator -pc_type lu` to see if the Jacobian you are using is 14849b92b1d3SBarry Smith wrong. If the problem is too large for a direct solve, try 14859b92b1d3SBarry Smith 14869b92b1d3SBarry Smith ```text 14879b92b1d3SBarry Smith -snes_mf_operator -pc_type ksp -ksp_ksp_rtol 1e-12. 14889b92b1d3SBarry Smith ``` 14899b92b1d3SBarry Smith 14909b92b1d3SBarry Smith Compare the output when you add `-mat_mffd_type ds` to see if the result is sensitive 14919b92b1d3SBarry Smith to choice of differencing parameter. 14929b92b1d3SBarry Smith 14939b92b1d3SBarry Smith- Run with `-snes_linesearch_monitor` to see if the line search is failing (this is 14949b92b1d3SBarry Smith usually a sign of a bad Jacobian). Use `-info` in PETSc 3.1 and older versions, 14959b92b1d3SBarry Smith `-snes_ls_monitor` in PETSc 3.2 and `-snes_linesearch_monitor` in PETSc 3.3 and 14969b92b1d3SBarry Smith later. 14979b92b1d3SBarry Smith 14989b92b1d3SBarry SmithHere are some ways to help the Newton process if everything above checks out: 14999b92b1d3SBarry Smith 15009b92b1d3SBarry Smith- Run with grid sequencing (`-snes_grid_sequence` if working with a `DM` is all you 15019b92b1d3SBarry Smith need) to generate better initial guess on your finer mesh. 15029b92b1d3SBarry Smith 15039b92b1d3SBarry Smith- Run with quad precision, i.e. 15049b92b1d3SBarry Smith 15059b92b1d3SBarry Smith ```console 15069b92b1d3SBarry Smith $ ./configure --with-precision=__float128 --download-f2cblaslapack 15079b92b1d3SBarry Smith ``` 15089b92b1d3SBarry Smith 15099b92b1d3SBarry Smith :::{note} 15109b92b1d3SBarry Smith quad precision requires PETSc 3.2 and later and recent versions of the GNU compilers. 15119b92b1d3SBarry Smith ::: 15129b92b1d3SBarry Smith 15139b92b1d3SBarry Smith- Change the units (nondimensionalization), boundary condition scaling, or formulation so 15149b92b1d3SBarry Smith that the Jacobian is better conditioned. See [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and 15159b92b1d3SBarry Smith Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127). 15169b92b1d3SBarry Smith 15179b92b1d3SBarry Smith- Mollify features in the function that do not have continuous first derivatives (often 15189b92b1d3SBarry Smith occurs when there are "if" statements in the residual evaluation, e.g. phase change or 15199b92b1d3SBarry Smith TVD limiters). Use a variational inequality solver (`SNESVINEWTONRSLS`) if the 15209b92b1d3SBarry Smith discontinuities are of fundamental importance. 15219b92b1d3SBarry Smith 15229b92b1d3SBarry Smith- Try a trust region method (`-ts_type tr`, may have to adjust parameters). 15239b92b1d3SBarry Smith 15249b92b1d3SBarry Smith- Run with some continuation parameter from a point where you know the solution, see 15259b92b1d3SBarry Smith `TSPSEUDO` for steady-states. 15269b92b1d3SBarry Smith 15279b92b1d3SBarry Smith- There are homotopy solver packages like PHCpack that can get you all possible solutions 15289b92b1d3SBarry Smith (and tell you that it has found them all) but those are not scalable and cannot solve 15299b92b1d3SBarry Smith anything but small problems. 15309b92b1d3SBarry Smith 15319b92b1d3SBarry Smith(doc_faq_execution_kspconv)= 15329b92b1d3SBarry Smith 15339b92b1d3SBarry Smith### Why is the linear solver (KSP) not converging, or converges slowly? 15349b92b1d3SBarry Smith 15359b92b1d3SBarry Smith:::{tip} 15369b92b1d3SBarry SmithAlways run with `-ksp_converged_reason -ksp_monitor_true_residual` when trying to 15379b92b1d3SBarry Smithlearn why a method is not converging! 15389b92b1d3SBarry Smith::: 15399b92b1d3SBarry Smith 15409b92b1d3SBarry SmithCommon reasons for KSP not converging are: 15419b92b1d3SBarry Smith 15429b92b1d3SBarry Smith- A symmetric method is being used for a non-symmetric problem. 15439b92b1d3SBarry Smith 15449b92b1d3SBarry Smith- The equations are singular by accident (e.g. forgot to impose boundary 15459b92b1d3SBarry Smith conditions). Check this for a small problem using `-pc_type svd -pc_svd_monitor`. 15469b92b1d3SBarry Smith 15479b92b1d3SBarry Smith- The equations are intentionally singular (e.g. constant null space), but the Krylov 15489b92b1d3SBarry Smith method was not informed, see `MatSetNullSpace()`. Always inform your local Krylov 15499b92b1d3SBarry Smith subspace solver of any change of singularity. Failure to do so will result in the 15509b92b1d3SBarry Smith immediate revocation of your computing and keyboard operator licenses, as well as 15519b92b1d3SBarry Smith a stern talking-to by the nearest Krylov Subspace Method representative. 15529b92b1d3SBarry Smith 15539b92b1d3SBarry Smith- The equations are intentionally singular and `MatSetNullSpace()` was used, but the 15549b92b1d3SBarry Smith right-hand side is not consistent. You may have to call `MatNullSpaceRemove()` on the 15559b92b1d3SBarry Smith right-hand side before calling `KSPSolve()`. See `MatSetTransposeNullSpace()`. 15569b92b1d3SBarry Smith 15579b92b1d3SBarry Smith- The equations are indefinite so that standard preconditioners don't work. Usually you 15589b92b1d3SBarry Smith will know this from the physics, but you can check with 15599b92b1d3SBarry Smith 15609b92b1d3SBarry Smith ```text 15619b92b1d3SBarry Smith -ksp_compute_eigenvalues -ksp_gmres_restart 1000 -pc_type none 15629b92b1d3SBarry Smith ``` 15639b92b1d3SBarry Smith 15649b92b1d3SBarry Smith For simple saddle point problems, try 15659b92b1d3SBarry Smith 15669b92b1d3SBarry Smith ```text 15679b92b1d3SBarry Smith -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_detect_saddle_point 15689b92b1d3SBarry Smith ``` 15699b92b1d3SBarry Smith 15709b92b1d3SBarry Smith For more difficult problems, read the literature to find robust methods and ask 15719b92b1d3SBarry Smith <mailto:petsc-users@mcs.anl.gov> or <mailto:petsc-maint@mcs.anl.gov> if you want advice about how to 15729b92b1d3SBarry Smith implement them. 15739b92b1d3SBarry Smith 15749b92b1d3SBarry Smith- If the method converges in preconditioned residual, but not in true residual, the 15759b92b1d3SBarry Smith preconditioner is likely singular or nearly so. This is common for saddle point problems 15769b92b1d3SBarry Smith (e.g. incompressible flow) or strongly nonsymmetric operators (e.g. low-Mach hyperbolic 15779b92b1d3SBarry Smith problems with large time steps). 15789b92b1d3SBarry Smith 15799b92b1d3SBarry Smith- The preconditioner is too weak or is unstable. See if `-pc_type asm -sub_pc_type lu` 15809b92b1d3SBarry Smith improves the convergence rate. If GMRES is losing too much progress in the restart, see 15819b92b1d3SBarry Smith if longer restarts help `-ksp_gmres_restart 300`. If a transpose is available, try 15829b92b1d3SBarry Smith `-ksp_type bcgs` or other methods that do not require a restart. 15839b92b1d3SBarry Smith 15849b92b1d3SBarry Smith :::{note} 15859b92b1d3SBarry Smith Unfortunately convergence with these methods is frequently erratic. 15869b92b1d3SBarry Smith ::: 15879b92b1d3SBarry Smith 15889b92b1d3SBarry Smith- The preconditioner is nonlinear (e.g. a nested iterative solve), try `-ksp_type 15899b92b1d3SBarry Smith fgmres` or `-ksp_type gcr`. 15909b92b1d3SBarry Smith 15919b92b1d3SBarry Smith- You are using geometric multigrid, but some equations (often boundary conditions) are 15929b92b1d3SBarry Smith not scaled compatibly between levels. Try `-pc_mg_galerkin` both to algebraically 15939b92b1d3SBarry Smith construct a correctly scaled coarse operator or make sure that all the equations are 15949b92b1d3SBarry Smith scaled in the same way if you want to use rediscretized coarse levels. 15959b92b1d3SBarry Smith 15969b92b1d3SBarry Smith- The matrix is very ill-conditioned. Check the {ref}`condition number <doc_faq_usage_condnum>`. 15979b92b1d3SBarry Smith 15989b92b1d3SBarry Smith - Try to improve it by choosing the relative scaling of components/boundary conditions. 15999b92b1d3SBarry Smith - Try `-ksp_diagonal_scale -ksp_diagonal_scale_fix`. 16009b92b1d3SBarry Smith - Perhaps change the formulation of the problem to produce more friendly algebraic 16019b92b1d3SBarry Smith equations. 16029b92b1d3SBarry Smith 16039b92b1d3SBarry Smith- Change the units (nondimensionalization), boundary condition scaling, or formulation so 16049b92b1d3SBarry Smith that the Jacobian is better conditioned. See [Buckingham pi theorem](https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem) and [Dimensional and 16059b92b1d3SBarry Smith Scaling Analysis](https://epubs.siam.org/doi/pdf/10.1137/16M1107127). 16069b92b1d3SBarry Smith 16079b92b1d3SBarry Smith- Classical Gram-Schmidt is becoming unstable, try `-ksp_gmres_modifiedgramschmidt` or 16089b92b1d3SBarry Smith use a method that orthogonalizes differently, e.g. `-ksp_type gcr`. 16099b92b1d3SBarry Smith 16109b92b1d3SBarry Smith### I get the error message: Actual argument at (1) to assumed-type dummy is of derived type with type-bound or FINAL procedures 16119b92b1d3SBarry Smith 16129b92b1d3SBarry SmithUse the following code-snippet: 16139b92b1d3SBarry Smith 16149b92b1d3SBarry Smith```fortran 16159b92b1d3SBarry Smithmodule context_module 16169b92b1d3SBarry Smith#include petsc/finclude/petsc.h 16179b92b1d3SBarry Smithuse petsc 16189b92b1d3SBarry Smithimplicit none 16199b92b1d3SBarry Smithprivate 16209b92b1d3SBarry Smithtype, public :: context_type 16219b92b1d3SBarry Smith private 16229b92b1d3SBarry Smith PetscInt :: foo 16239b92b1d3SBarry Smithcontains 16249b92b1d3SBarry Smith procedure, public :: init => context_init 16259b92b1d3SBarry Smithend type context_type 16269b92b1d3SBarry Smithcontains 16279b92b1d3SBarry Smithsubroutine context_init(self, foo) 16289b92b1d3SBarry Smith class(context_type), intent(in out) :: self 16299b92b1d3SBarry Smith PetscInt, intent(in) :: foo 16309b92b1d3SBarry Smith self%foo = foo 16319b92b1d3SBarry Smithend subroutine context_init 16329b92b1d3SBarry Smithend module context_module 16339b92b1d3SBarry Smith 16349b92b1d3SBarry Smith!------------------------------------------------------------------------ 16359b92b1d3SBarry Smith 16369b92b1d3SBarry Smithprogram test_snes 16379b92b1d3SBarry Smithuse,intrinsic :: iso_c_binding 16389b92b1d3SBarry Smithuse petsc 16399b92b1d3SBarry Smithuse context_module 16409b92b1d3SBarry Smithimplicit none 16419b92b1d3SBarry Smith 16429b92b1d3SBarry SmithSNES :: snes 16439b92b1d3SBarry Smithtype(context_type),target :: context 16449b92b1d3SBarry Smithtype(c_ptr) :: contextOut 16459b92b1d3SBarry SmithPetscErrorCode :: ierr 16469b92b1d3SBarry Smith 16479b92b1d3SBarry Smithcall PetscInitialize(PETSC_NULL_CHARACTER, ierr) 16489b92b1d3SBarry Smithcall SNESCreate(PETSC_COMM_WORLD, snes, ierr) 16499b92b1d3SBarry Smithcall context%init(1) 16509b92b1d3SBarry Smith 16519b92b1d3SBarry SmithcontextOut = c_loc(context) ! contextOut is a C pointer on context 16529b92b1d3SBarry Smith 16539b92b1d3SBarry Smithcall SNESSetConvergenceTest(snes, convergence, contextOut, PETSC_NULL_FUNCTION, ierr) 16549b92b1d3SBarry Smith 16559b92b1d3SBarry Smithcall SNESDestroy(snes, ierr) 16569b92b1d3SBarry Smithcall PetscFinalize(ierr) 16579b92b1d3SBarry Smith 16589b92b1d3SBarry Smithcontains 16599b92b1d3SBarry Smith 16609b92b1d3SBarry Smithsubroutine convergence(snes, num_iterations, xnorm, pnorm,fnorm, reason, contextIn, ierr) 16619b92b1d3SBarry SmithSNES, intent(in) :: snes 16629b92b1d3SBarry Smith 16639b92b1d3SBarry SmithPetscInt, intent(in) :: num_iterations 16649b92b1d3SBarry SmithPetscReal, intent(in) :: xnorm, pnorm, fnorm 16659b92b1d3SBarry SmithSNESConvergedReason, intent(out) :: reason 16669b92b1d3SBarry Smithtype(c_ptr), intent(in out) :: contextIn 16679b92b1d3SBarry Smithtype(context_type), pointer :: context 16689b92b1d3SBarry SmithPetscErrorCode, intent(out) :: ierr 16699b92b1d3SBarry Smith 16709b92b1d3SBarry Smithcall c_f_pointer(contextIn,context) ! convert the C pointer to a Fortran pointer to use context as in the main program 16719b92b1d3SBarry Smithreason = 0 16729b92b1d3SBarry Smithierr = 0 16739b92b1d3SBarry Smithend subroutine convergence 16749b92b1d3SBarry Smithend program test_snes 16759b92b1d3SBarry Smith``` 16769b92b1d3SBarry Smith 16779b92b1d3SBarry Smith### In C++ I get a crash on `VecDestroy()` (or some other PETSc object) at the end of the program 16789b92b1d3SBarry Smith 16799b92b1d3SBarry SmithThis can happen when the destructor for a C++ class is automatically called at the end of 16809b92b1d3SBarry Smiththe program after `PetscFinalize()`. Use the following code-snippet: 16819b92b1d3SBarry Smith 16829b92b1d3SBarry Smith``` 16839b92b1d3SBarry Smithmain() 16849b92b1d3SBarry Smith{ 16859b92b1d3SBarry Smith PetscCall(PetscInitialize()); 16869b92b1d3SBarry Smith { 16879b92b1d3SBarry Smith your variables 16889b92b1d3SBarry Smith your code 16899b92b1d3SBarry Smith 16909b92b1d3SBarry Smith ... /* all your destructors are called here automatically by C++ so they work correctly */ 16919b92b1d3SBarry Smith } 16929b92b1d3SBarry Smith PetscCall(PetscFinalize()); 16939b92b1d3SBarry Smith return 0 16949b92b1d3SBarry Smith} 16959b92b1d3SBarry Smith``` 16969b92b1d3SBarry Smith 16979b92b1d3SBarry Smith______________________________________________________________________ 16989b92b1d3SBarry Smith 16999b92b1d3SBarry Smith## Debugging 17009b92b1d3SBarry Smith 17019b92b1d3SBarry Smith### What does the message hwloc/linux: Ignoring PCI device with non-16bit domain mean? 17029b92b1d3SBarry Smith 17039b92b1d3SBarry SmithThis is printed by the hwloc library that is used by some MPI implementations. It can be ignored. 17049b92b1d3SBarry SmithTo prevent the message from always being printed set the environmental variable `HWLOC_HIDE_ERRORS` to 2. 17059b92b1d3SBarry SmithFor example 17069b92b1d3SBarry Smith 17079b92b1d3SBarry Smith``` 17089b92b1d3SBarry Smithexport HWLOC_HIDE_ERRORS=2 17099b92b1d3SBarry Smith``` 17109b92b1d3SBarry Smith 17119b92b1d3SBarry Smithwhich can be added to your login profile file such as `~/.bashrc`. 17129b92b1d3SBarry Smith 17139b92b1d3SBarry Smith### How do I turn off PETSc signal handling so I can use the `-C` Option On `xlf`? 17149b92b1d3SBarry Smith 17159b92b1d3SBarry SmithImmediately after calling `PetscInitialize()` call `PetscPopSignalHandler()`. 17169b92b1d3SBarry Smith 17179b92b1d3SBarry SmithSome Fortran compilers including the IBM xlf, xlF etc compilers have a compile option 17189b92b1d3SBarry Smith(`-C` for IBM's) that causes all array access in Fortran to be checked that they are 17199b92b1d3SBarry Smithin-bounds. This is a great feature but does require that the array dimensions be set 17209b92b1d3SBarry Smithexplicitly, not with a \*. 17219b92b1d3SBarry Smith 17229b92b1d3SBarry Smith### How do I debug if `-start_in_debugger` does not work on my machine? 17239b92b1d3SBarry Smith 17249b92b1d3SBarry SmithThe script <https://github.com/Azrael3000/tmpi> can be used to run multiple MPI 17259b92b1d3SBarry Smithranks in the debugger using tmux. 17269b92b1d3SBarry Smith 17279b92b1d3SBarry SmithOn newer macOS machines - one has to be in admin group to be able to use debugger. 17289b92b1d3SBarry Smith 17299b92b1d3SBarry SmithOn newer Ubuntu linux machines - one has to disable `ptrace_scope` with 17309b92b1d3SBarry Smith 17319b92b1d3SBarry Smith```console 17329b92b1d3SBarry Smith$ sudo echo 0 > /proc/sys/kernel/yama/ptrace_scope 17339b92b1d3SBarry Smith``` 17349b92b1d3SBarry Smith 17359b92b1d3SBarry Smithto get start in debugger working. 17369b92b1d3SBarry Smith 17379b92b1d3SBarry SmithIf `-start_in_debugger` does not work on your OS, for a uniprocessor job, just 17389b92b1d3SBarry Smithtry the debugger directly, for example: `gdb ex1`. You can also use [TotalView](https://totalview.io/products/totalview) which is a good graphical parallel debugger. 17399b92b1d3SBarry Smith 17409b92b1d3SBarry Smith### How do I see where my code is hanging? 17419b92b1d3SBarry Smith 17429b92b1d3SBarry SmithYou can use the `-start_in_debugger` option to start all processes in the debugger (each 17439b92b1d3SBarry Smithwill come up in its own xterm) or run in [TotalView](https://totalview.io/products/totalview). Then use `cont` (for continue) in each 17449b92b1d3SBarry Smithxterm. Once you are sure that the program is hanging, hit control-c in each xterm and then 17459b92b1d3SBarry Smithuse 'where' to print a stack trace for each process. 17469b92b1d3SBarry Smith 17479b92b1d3SBarry Smith### How can I inspect PETSc vector and matrix values when in the debugger? 17489b92b1d3SBarry Smith 17499b92b1d3SBarry SmithI will illustrate this with `gdb`, but it should be similar on other debuggers. You can 17509b92b1d3SBarry Smithlook at local `Vec` values directly by obtaining the array. For a `Vec` v, we can 17519b92b1d3SBarry Smithprint all local values using: 17529b92b1d3SBarry Smith 17539b92b1d3SBarry Smith```console 17549b92b1d3SBarry Smith(gdb) p ((Vec_Seq*) v->data)->array[0]@v->map.n 17559b92b1d3SBarry Smith``` 17569b92b1d3SBarry Smith 17579b92b1d3SBarry SmithHowever, this becomes much more complicated for a matrix. Therefore, it is advisable to use the default viewer to look at the object. For a `Vec` v and a `Mat` m, this would be: 17589b92b1d3SBarry Smith 17599b92b1d3SBarry Smith```console 17609b92b1d3SBarry Smith(gdb) call VecView(v, 0) 17619b92b1d3SBarry Smith(gdb) call MatView(m, 0) 17629b92b1d3SBarry Smith``` 17639b92b1d3SBarry Smith 17649b92b1d3SBarry Smithor with a communicator other than `MPI_COMM_WORLD`: 17659b92b1d3SBarry Smith 17669b92b1d3SBarry Smith```console 17679b92b1d3SBarry Smith(gdb) call MatView(m, PETSC_VIEWER_STDOUT_(m->comm)) 17689b92b1d3SBarry Smith``` 17699b92b1d3SBarry Smith 17709b92b1d3SBarry SmithTotalview 8.8.0+ has a new feature that allows libraries to provide their own code to 17719b92b1d3SBarry Smithdisplay objects in the debugger. Thus in theory each PETSc object, `Vec`, `Mat` etc 17729b92b1d3SBarry Smithcould have custom code to print values in the object. We have only done this for the most 17739b92b1d3SBarry Smithelementary display of `Vec` and `Mat`. See the routine `TV_display_type()` in 17749b92b1d3SBarry Smith`src/vec/vec/interface/vector.c` for an example of how these may be written. Contact us 17759b92b1d3SBarry Smithif you would like to add more. 17769b92b1d3SBarry Smith 17779b92b1d3SBarry Smith### How can I find the cause of floating point exceptions like not-a-number (NaN) or infinity? 17789b92b1d3SBarry Smith 17799b92b1d3SBarry SmithThe best way to locate floating point exceptions is to use a debugger. On supported 17809b92b1d3SBarry Smitharchitectures (including Linux and glibc-based systems), just run in a debugger and pass 17819b92b1d3SBarry Smith`-fp_trap` to the PETSc application. This will activate signaling exceptions and the 17829b92b1d3SBarry Smithdebugger will break on the line that first divides by zero or otherwise generates an 17839b92b1d3SBarry Smithexceptions. 17849b92b1d3SBarry Smith 17859b92b1d3SBarry SmithWithout a debugger, running with `-fp_trap` in debug mode will only identify the 17869b92b1d3SBarry Smithfunction in which the error occurred, but not the line or the type of exception. If 17879b92b1d3SBarry Smith`-fp_trap` is not supported on your architecture, consult the documentation for your 17889b92b1d3SBarry Smithdebugger since there is likely a way to have it catch exceptions. 17899b92b1d3SBarry Smith 17909b92b1d3SBarry Smith(error_libimf)= 17919b92b1d3SBarry Smith 17929b92b1d3SBarry Smith### Error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory 17939b92b1d3SBarry Smith 17949b92b1d3SBarry SmithThe Intel compilers use shared libraries (like libimf) that cannot be found, by default, at run 17959b92b1d3SBarry Smithtime. When using the Intel compilers (and running the resulting code) you must make sure 17969b92b1d3SBarry Smiththat the proper Intel initialization scripts are run. This is usually done by adding some 17979b92b1d3SBarry Smithcode into your `.cshrc`, `.bashrc`, `.profile` etc file. Sometimes on batch file 17989b92b1d3SBarry Smithsystems that do now access your initialization files (like .cshrc) you must include the 17999b92b1d3SBarry Smithinitialization calls in your batch file submission. 18009b92b1d3SBarry Smith 18019b92b1d3SBarry SmithFor example, on my Mac using `csh` I have the following in my `.cshrc` file: 18029b92b1d3SBarry Smith 18039b92b1d3SBarry Smith```csh 18049b92b1d3SBarry Smithsource /opt/intel/cc/10.1.012/bin/iccvars.csh 18059b92b1d3SBarry Smithsource /opt/intel/fc/10.1.012/bin/ifortvars.csh 18069b92b1d3SBarry Smithsource /opt/intel/idb/10.1.012/bin/idbvars.csh 18079b92b1d3SBarry Smith``` 18089b92b1d3SBarry Smith 18099b92b1d3SBarry SmithAnd in my `.profile` I have 18109b92b1d3SBarry Smith 18119b92b1d3SBarry Smith```csh 18129b92b1d3SBarry Smithsource /opt/intel/cc/10.1.012/bin/iccvars.sh 18139b92b1d3SBarry Smithsource /opt/intel/fc/10.1.012/bin/ifortvars.sh 18149b92b1d3SBarry Smithsource /opt/intel/idb/10.1.012/bin/idbvars.sh 18159b92b1d3SBarry Smith``` 18169b92b1d3SBarry Smith 18179b92b1d3SBarry Smith(object_type_not_set)= 18189b92b1d3SBarry Smith 18199b92b1d3SBarry Smith### What does "Object Type Not Set: Argument # N" Mean? 18209b92b1d3SBarry Smith 18219b92b1d3SBarry SmithMany operations on PETSc objects require that the specific type of the object be set before the operations is performed. You must call `XXXSetType()` or `XXXSetFromOptions()` before you make the offending call. For example 18229b92b1d3SBarry Smith 18239b92b1d3SBarry Smith``` 18249b92b1d3SBarry SmithMat A; 18259b92b1d3SBarry Smith 18269b92b1d3SBarry SmithPetscCall(MatCreate(PETSC_COMM_WORLD, &A)); 18279b92b1d3SBarry SmithPetscCall(MatSetValues(A,....)); 18289b92b1d3SBarry Smith``` 18299b92b1d3SBarry Smith 18309b92b1d3SBarry Smithwill not work. You must add `MatSetType()` or `MatSetFromOptions()` before the call to `MatSetValues()`. I.e. 18319b92b1d3SBarry Smith 18329b92b1d3SBarry Smith``` 18339b92b1d3SBarry SmithMat A; 18349b92b1d3SBarry Smith 18359b92b1d3SBarry SmithPetscCall(MatCreate(PETSC_COMM_WORLD, &A)); 18369b92b1d3SBarry Smith 18379b92b1d3SBarry SmithPetscCall(MatSetType(A, MATAIJ)); 18389b92b1d3SBarry Smith/* Will override MatSetType() */ 18399b92b1d3SBarry SmithPetscCall(MatSetFromOptions()); 18409b92b1d3SBarry Smith 18419b92b1d3SBarry SmithPetscCall(MatSetValues(A,....)); 18429b92b1d3SBarry Smith``` 18439b92b1d3SBarry Smith 18449b92b1d3SBarry Smith(split_ownership)= 18459b92b1d3SBarry Smith 18469b92b1d3SBarry Smith### What does error detected in `PetscSplitOwnership()` about "sum of local lengths ...": mean? 18479b92b1d3SBarry Smith 18489b92b1d3SBarry SmithIn a previous call to `VecSetSizes()`, `MatSetSizes()`, `VecCreateXXX()` or 18499b92b1d3SBarry Smith`MatCreateXXX()` you passed in local and global sizes that do not make sense for the 18509b92b1d3SBarry Smithcorrect number of processors. For example if you pass in a local size of 2 and a global 18519b92b1d3SBarry Smithsize of 100 and run on two processors, this cannot work since the sum of the local sizes 18529b92b1d3SBarry Smithis 4, not 100. 18539b92b1d3SBarry Smith 18549b92b1d3SBarry Smith(valgrind)= 18559b92b1d3SBarry Smith 1856*74df5e01SJunchao Zhang### What does "corrupt argument" or "caught signal" Or "SEGV" Or "segmentation violation" Or "bus error" mean? Can I use Valgrind or CUDA Compute Sanitizer to debug memory corruption issues? 18579b92b1d3SBarry Smith 18589b92b1d3SBarry SmithSometimes it can mean an argument to a function is invalid. In Fortran this may be caused 18599b92b1d3SBarry Smithby forgetting to list an argument in the call, especially the final `PetscErrorCode`. 18609b92b1d3SBarry Smith 18619b92b1d3SBarry SmithOtherwise it is usually caused by memory corruption; that is somewhere the code is writing 18629b92b1d3SBarry Smithout of array bounds. To track this down rerun the debug version of the code with the 18639b92b1d3SBarry Smithoption `-malloc_debug`. Occasionally the code may crash only with the optimized version, 18649b92b1d3SBarry Smithin that case run the optimized version with `-malloc_debug`. If you determine the 18659b92b1d3SBarry Smithproblem is from memory corruption you can put the macro CHKMEMQ in the code near the crash 18669b92b1d3SBarry Smithto determine exactly what line is causing the problem. 18679b92b1d3SBarry Smith 1868*74df5e01SJunchao ZhangIf `-malloc_debug` does not help: on NVIDIA CUDA systems you can use <https://docs.nvidia.com/compute-sanitizer/ComputeSanitizer/index.html>, 1869*74df5e01SJunchao Zhangfor example, `compute-sanitizer --tool memcheck [sanitizer_options] app_name [app_options]`. 18709b92b1d3SBarry Smith 18719b92b1d3SBarry SmithIf `-malloc_debug` does not help: on GNU/Linux (not macOS machines) - you can 18729b92b1d3SBarry Smithuse [valgrind](http://valgrind.org). Follow the below instructions: 18739b92b1d3SBarry Smith 18749b92b1d3SBarry Smith1. `configure` PETSc with `--download-mpich --with-debugging` (you can use other MPI implementations but most produce spurious Valgrind messages) 18759b92b1d3SBarry Smith 18769b92b1d3SBarry Smith2. Compile your application code with this build of PETSc. 18779b92b1d3SBarry Smith 18789b92b1d3SBarry Smith3. Run with Valgrind. 18799b92b1d3SBarry Smith 18809b92b1d3SBarry Smith ```console 18819b92b1d3SBarry Smith $ $PETSC_DIR/lib/petsc/bin/petscmpiexec -valgrind -n NPROC PETSCPROGRAMNAME PROGRAMOPTIONS 18829b92b1d3SBarry Smith ``` 18839b92b1d3SBarry Smith 18849b92b1d3SBarry Smith or 18859b92b1d3SBarry Smith 18869b92b1d3SBarry Smith ```console 18879b92b1d3SBarry Smith $ mpiexec -n NPROC valgrind --tool=memcheck -q --num-callers=20 \ 18889b92b1d3SBarry Smith --suppressions=$PETSC_DIR/share/petsc/suppressions/valgrind \ 18899b92b1d3SBarry Smith --log-file=valgrind.log.%p PETSCPROGRAMNAME -malloc off PROGRAMOPTIONS 18909b92b1d3SBarry Smith ``` 18919b92b1d3SBarry Smith 18929b92b1d3SBarry Smith:::{note} 18939b92b1d3SBarry Smith- option `--with-debugging` enables valgrind to give stack trace with additional 18949b92b1d3SBarry Smith source-file:line-number info. 18959b92b1d3SBarry Smith- option `--download-mpich` is valgrind clean, other MPI builds are not valgrind clean. 18969b92b1d3SBarry Smith- when `--download-mpich` is used - mpiexec will be in `$PETSC_ARCH/bin` 18979b92b1d3SBarry Smith- `--log-file=valgrind.log.%p` option tells valgrind to store the output from each 18989b92b1d3SBarry Smith process in a different file \[as %p i.e PID, is different for each MPI process. 18999b92b1d3SBarry Smith- `memcheck` will not find certain array access that violate static array 19009b92b1d3SBarry Smith declarations so if memcheck runs clean you can try the `--tool=exp-ptrcheck` 19019b92b1d3SBarry Smith instead. 19029b92b1d3SBarry Smith::: 19039b92b1d3SBarry Smith 19049b92b1d3SBarry SmithYou might also consider using <http://drmemory.org> which has support for GNU/Linux, Apple 19059b92b1d3SBarry SmithMac OS and Microsoft Windows machines. (Note we haven't tried this ourselves). 19069b92b1d3SBarry Smith 19079b92b1d3SBarry Smith(zeropivot)= 19089b92b1d3SBarry Smith 19099b92b1d3SBarry Smith### What does "detected zero pivot in LU factorization" mean? 19109b92b1d3SBarry Smith 19119b92b1d3SBarry SmithA zero pivot in LU, ILU, Cholesky, or ICC sparse factorization does not always mean that 19129b92b1d3SBarry Smiththe matrix is singular. You can use 19139b92b1d3SBarry Smith 19149b92b1d3SBarry Smith```text 19159b92b1d3SBarry Smith-pc_factor_shift_type nonzero -pc_factor_shift_amount [amount] 19169b92b1d3SBarry Smith``` 19179b92b1d3SBarry Smith 19189b92b1d3SBarry Smithor 19199b92b1d3SBarry Smith 19209b92b1d3SBarry Smith```text 19219b92b1d3SBarry Smith-pc_factor_shift_type positive_definite -[level]_pc_factor_shift_type nonzero 19229b92b1d3SBarry Smith -pc_factor_shift_amount [amount] 19239b92b1d3SBarry Smith``` 19249b92b1d3SBarry Smith 19259b92b1d3SBarry Smithor 19269b92b1d3SBarry Smith 19279b92b1d3SBarry Smith```text 19289b92b1d3SBarry Smith-[level]_pc_factor_shift_type positive_definite 19299b92b1d3SBarry Smith``` 19309b92b1d3SBarry Smith 19319b92b1d3SBarry Smithto prevent the zero pivot. [level] is "sub" when lu, ilu, cholesky, or icc are employed in 19329b92b1d3SBarry Smitheach individual block of the bjacobi or ASM preconditioner. [level] is "mg_levels" or 19339b92b1d3SBarry Smith"mg_coarse" when lu, ilu, cholesky, or icc are used inside multigrid smoothers or to the 19349b92b1d3SBarry Smithcoarse grid solver. See `PCFactorSetShiftType()`, `PCFactorSetShiftAmount()`. 19359b92b1d3SBarry Smith 19369b92b1d3SBarry SmithThis error can also happen if your matrix is singular, see `MatSetNullSpace()` for how 19379b92b1d3SBarry Smithto handle this. If this error occurs in the zeroth row of the matrix, it is likely you 19389b92b1d3SBarry Smithhave an error in the code that generates the matrix. 19399b92b1d3SBarry Smith 19409b92b1d3SBarry Smith### You create draw windows or `PETSCVIEWERDRAW` windows or use options `-ksp_monitor draw::draw_lg` or `-snes_monitor draw::draw_lg` and the program seems to run OK but windows never open 19419b92b1d3SBarry Smith 19429b92b1d3SBarry SmithThe libraries were compiled without support for X windows. Make sure that `configure` 19439b92b1d3SBarry Smithwas run with the option `--with-x`. 19449b92b1d3SBarry Smith 19459b92b1d3SBarry Smith### The program seems to use more and more memory as it runs, even though you don't think you are allocating more memory 19469b92b1d3SBarry Smith 19479b92b1d3SBarry SmithSome of the following may be occurring: 19489b92b1d3SBarry Smith 19499b92b1d3SBarry Smith- You are creating new PETSc objects but never freeing them. 19509b92b1d3SBarry Smith- There is a memory leak in PETSc or your code. 19519b92b1d3SBarry Smith- Something much more subtle: (if you are using Fortran). When you declare a large array 19529b92b1d3SBarry Smith in Fortran, the operating system does not allocate all the memory pages for that array 19539b92b1d3SBarry Smith until you start using the different locations in the array. Thus, in a code, if at each 19549b92b1d3SBarry Smith step you start using later values in the array your virtual memory usage will "continue" 19559b92b1d3SBarry Smith to increase as measured by `ps` or `top`. 19569b92b1d3SBarry Smith- You are running with the `-log`, `-log_mpe`, or `-log_all` option. With these 19579b92b1d3SBarry Smith options, a great deal of logging information is stored in memory until the conclusion of 19589b92b1d3SBarry Smith the run. 19599b92b1d3SBarry Smith- You are linking with the MPI profiling libraries; these cause logging of all MPI 19609b92b1d3SBarry Smith activities. Another symptom is at the conclusion of the run it may print some message 19619b92b1d3SBarry Smith about writing log files. 19629b92b1d3SBarry Smith 19639b92b1d3SBarry SmithThe following may help: 19649b92b1d3SBarry Smith 19659b92b1d3SBarry Smith- Run with the `-malloc_debug` option and `-malloc_view`. Or use `PetscMallocDump()` 19669b92b1d3SBarry Smith and `PetscMallocView()` sprinkled about your code to track memory that is allocated 19679b92b1d3SBarry Smith and not later freed. Use the commands `PetscMallocGetCurrentUsage()` and 19689b92b1d3SBarry Smith `PetscMemoryGetCurrentUsage()` to monitor memory allocated and 19699b92b1d3SBarry Smith `PetscMallocGetMaximumUsage()` and `PetscMemoryGetMaximumUsage()` for total memory 19709b92b1d3SBarry Smith used as the code progresses. 19719b92b1d3SBarry Smith- This is just the way Unix works and is harmless. 19729b92b1d3SBarry Smith- Do not use the `-log`, `-log_mpe`, or `-log_all` option, or use 19739b92b1d3SBarry Smith `PetscLogEventDeactivate()` or `PetscLogEventDeactivateClass()` to turn off logging of 19749b92b1d3SBarry Smith specific events. 19759b92b1d3SBarry Smith- Make sure you do not link with the MPI profiling libraries. 19769b92b1d3SBarry Smith 19779b92b1d3SBarry Smith### When calling `MatPartitioningApply()` you get a message "Error! Key 16615 Not Found" 19789b92b1d3SBarry Smith 19799b92b1d3SBarry SmithThe graph of the matrix you are using is not symmetric. You must use symmetric matrices 19809b92b1d3SBarry Smithfor partitioning. 19819b92b1d3SBarry Smith 19829b92b1d3SBarry Smith### With GMRES at restart the second residual norm printed does not match the first 19839b92b1d3SBarry Smith 19849b92b1d3SBarry SmithI.e. 19859b92b1d3SBarry Smith 19869b92b1d3SBarry Smith```text 19879b92b1d3SBarry Smith26 KSP Residual norm 3.421544615851e-04 19889b92b1d3SBarry Smith27 KSP Residual norm 2.973675659493e-04 19899b92b1d3SBarry Smith28 KSP Residual norm 2.588642948270e-04 19909b92b1d3SBarry Smith29 KSP Residual norm 2.268190747349e-04 19919b92b1d3SBarry Smith30 KSP Residual norm 1.977245964368e-04 19929b92b1d3SBarry Smith30 KSP Residual norm 1.994426291979e-04 <----- At restart the residual norm is printed a second time 19939b92b1d3SBarry Smith``` 19949b92b1d3SBarry Smith 19959b92b1d3SBarry SmithThis is actually not surprising! GMRES computes the norm of the residual at each iteration 19969b92b1d3SBarry Smithvia a recurrence relation between the norms of the residuals at the previous iterations 19979b92b1d3SBarry Smithand quantities computed at the current iteration. It does not compute it via directly 19989b92b1d3SBarry Smith$|| b - A x^{n} ||$. 19999b92b1d3SBarry Smith 20009b92b1d3SBarry SmithSometimes, especially with an ill-conditioned matrix, or computation of the matrix-vector 20019b92b1d3SBarry Smithproduct via differencing, the residual norms computed by GMRES start to "drift" from the 20029b92b1d3SBarry Smithcorrect values. At the restart, we compute the residual norm directly, hence the "strange 20039b92b1d3SBarry Smithstuff," the difference printed. The drifting, if it remains small, is harmless (doesn't 20049b92b1d3SBarry Smithaffect the accuracy of the solution that GMRES computes). 20059b92b1d3SBarry Smith 20069b92b1d3SBarry SmithIf you use a more powerful preconditioner the drift will often be smaller and less 20079b92b1d3SBarry Smithnoticeable. Of if you are running matrix-free you may need to tune the matrix-free 20089b92b1d3SBarry Smithparameters. 20099b92b1d3SBarry Smith 20109b92b1d3SBarry Smith### Why do some Krylov methods seem to print two residual norms per iteration? 20119b92b1d3SBarry Smith 20129b92b1d3SBarry SmithI.e. 20139b92b1d3SBarry Smith 20149b92b1d3SBarry Smith```text 20159b92b1d3SBarry Smith1198 KSP Residual norm 1.366052062216e-04 20169b92b1d3SBarry Smith1198 KSP Residual norm 1.931875025549e-04 20179b92b1d3SBarry Smith1199 KSP Residual norm 1.366026406067e-04 20189b92b1d3SBarry Smith1199 KSP Residual norm 1.931819426344e-04 20199b92b1d3SBarry Smith``` 20209b92b1d3SBarry Smith 20219b92b1d3SBarry SmithSome Krylov methods, for example `KSPTFQMR`, actually have a "sub-iteration" of size 2 20229b92b1d3SBarry Smithinside the loop. Each of the two substeps has its own matrix vector product and 20239b92b1d3SBarry Smithapplication of the preconditioner and updates the residual approximations. This is why you 20249b92b1d3SBarry Smithget this "funny" output where it looks like there are two residual norms per 20259b92b1d3SBarry Smithiteration. You can also think of it as twice as many iterations. 20269b92b1d3SBarry Smith 20279b92b1d3SBarry Smith### Unable to locate PETSc dynamic library `libpetsc` 20289b92b1d3SBarry Smith 20299b92b1d3SBarry SmithWhen using DYNAMIC libraries - the libraries cannot be moved after they are 20309b92b1d3SBarry Smithinstalled. This could also happen on clusters - where the paths are different on the (run) 20319b92b1d3SBarry Smithnodes - than on the (compile) front-end. **Do not use dynamic libraries & shared 20329b92b1d3SBarry Smithlibraries**. Run `configure` with 20339b92b1d3SBarry Smith`--with-shared-libraries=0 --with-dynamic-loading=0`. 20349b92b1d3SBarry Smith 20359b92b1d3SBarry Smith:::{important} 2036f0b74427SPierre JolivetThis option has been removed in PETSc v3.5 20379b92b1d3SBarry Smith::: 20389b92b1d3SBarry Smith 20399b92b1d3SBarry Smith### How do I determine what update to PETSc broke my code? 20409b92b1d3SBarry Smith 20419b92b1d3SBarry SmithIf at some point (in PETSc code history) you had a working code - but the latest PETSc 20429b92b1d3SBarry Smithcode broke it, its possible to determine the PETSc code change that might have caused this 20439b92b1d3SBarry Smithbehavior. This is achieved by: 20449b92b1d3SBarry Smith 20459b92b1d3SBarry Smith- Using Git to access PETSc sources 20469b92b1d3SBarry Smith- Knowing the Git commit for the known working version of PETSc 20479b92b1d3SBarry Smith- Knowing the Git commit for the known broken version of PETSc 20489b92b1d3SBarry Smith- Using the [bisect](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html) 20499b92b1d3SBarry Smith functionality of Git 20509b92b1d3SBarry Smith 20519b92b1d3SBarry SmithGit bisect can be done as follows: 20529b92b1d3SBarry Smith 20539b92b1d3SBarry Smith1. Get PETSc development (main branch in git) sources 20549b92b1d3SBarry Smith 20559b92b1d3SBarry Smith ```console 20569b92b1d3SBarry Smith $ git clone https://gitlab.com/petsc/petsc.git 20579b92b1d3SBarry Smith ``` 20589b92b1d3SBarry Smith 20599b92b1d3SBarry Smith2. Find the good and bad markers to start the bisection process. This can be done either 20609b92b1d3SBarry Smith by checking `git log` or `gitk` or <https://gitlab.com/petsc/petsc> or the web 20619b92b1d3SBarry Smith history of petsc-release clones. Lets say the known bad commit is 21af4baa815c and 20629b92b1d3SBarry Smith known good commit is 5ae5ab319844. 20639b92b1d3SBarry Smith 20649b92b1d3SBarry Smith3. Start the bisection process with these known revisions. Build PETSc, and test your code 20659b92b1d3SBarry Smith to confirm known good/bad behavior: 20669b92b1d3SBarry Smith 20679b92b1d3SBarry Smith ```console 20689b92b1d3SBarry Smith $ git bisect start 21af4baa815c 5ae5ab319844 20699b92b1d3SBarry Smith ``` 20709b92b1d3SBarry Smith 20719b92b1d3SBarry Smith build/test, perhaps discover that this new state is bad 20729b92b1d3SBarry Smith 20739b92b1d3SBarry Smith ```console 20749b92b1d3SBarry Smith $ git bisect bad 20759b92b1d3SBarry Smith ``` 20769b92b1d3SBarry Smith 20779b92b1d3SBarry Smith build/test, perhaps discover that this state is good 20789b92b1d3SBarry Smith 20799b92b1d3SBarry Smith ```console 20809b92b1d3SBarry Smith $ git bisect good 20819b92b1d3SBarry Smith ``` 20829b92b1d3SBarry Smith 20839b92b1d3SBarry Smith Now until done - keep bisecting, building PETSc, and testing your code with it and 20849b92b1d3SBarry Smith determine if the code is working or not. After something like 5-15 iterations, `git 20859b92b1d3SBarry Smith bisect` will pin-point the exact code change that resulted in the difference in 20869b92b1d3SBarry Smith application behavior. 20879b92b1d3SBarry Smith 20889b92b1d3SBarry Smith:::{tip} 20899b92b1d3SBarry SmithSee [git-bisect(1)](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-bisect.html) and the 20909b92b1d3SBarry Smith[debugging section of the Git Book](https://git-scm.com/book/en/Git-Tools-Debugging-with-Git) for more debugging tips. 20919b92b1d3SBarry Smith::: 20929b92b1d3SBarry Smith 20939b92b1d3SBarry Smith### How to fix the error "PMIX Error: error in file gds_ds12_lock_pthread.c"? 20949b92b1d3SBarry Smith 20959b92b1d3SBarry SmithThis seems to be an error when using Open MPI and OpenBLAS with threads (or perhaps other 20969b92b1d3SBarry Smithpackages that use threads). 20979b92b1d3SBarry Smith 20989b92b1d3SBarry Smith```console 20999b92b1d3SBarry Smith$ export PMIX_MCA_gds=hash 21009b92b1d3SBarry Smith``` 21019b92b1d3SBarry Smith 21029b92b1d3SBarry SmithShould resolve the problem. 21039b92b1d3SBarry Smith 21049b92b1d3SBarry Smith______________________________________________________________________ 21059b92b1d3SBarry Smith 21069b92b1d3SBarry Smith(doc_faq_sharedlibs)= 21079b92b1d3SBarry Smith 21089b92b1d3SBarry Smith## Shared Libraries 21099b92b1d3SBarry Smith 21109b92b1d3SBarry Smith### Can I install PETSc libraries as shared libraries? 21119b92b1d3SBarry Smith 21129b92b1d3SBarry SmithYes. Use 21139b92b1d3SBarry Smith 21149b92b1d3SBarry Smith```console 21159b92b1d3SBarry Smith$ ./configure --with-shared-libraries 21169b92b1d3SBarry Smith``` 21179b92b1d3SBarry Smith 21189b92b1d3SBarry Smith### Why should I use shared libraries? 21199b92b1d3SBarry Smith 21209b92b1d3SBarry SmithWhen you link to shared libraries, the function symbols from the shared libraries are not 21219b92b1d3SBarry Smithcopied in the executable. This way the size of the executable is considerably smaller than 21229b92b1d3SBarry Smithwhen using regular libraries. This helps in a couple of ways: 21239b92b1d3SBarry Smith 21249b92b1d3SBarry Smith- Saves disk space when more than one executable is created 21259b92b1d3SBarry Smith- Improves the compile time immensely, because the compiler has to write a much smaller 21269b92b1d3SBarry Smith file (executable) to the disk. 21279b92b1d3SBarry Smith 21289b92b1d3SBarry Smith### How do I link to the PETSc shared libraries? 21299b92b1d3SBarry Smith 21309b92b1d3SBarry SmithBy default, the compiler should pick up the shared libraries instead of the regular 21319b92b1d3SBarry Smithones. Nothing special should be done for this. 21329b92b1d3SBarry Smith 21339b92b1d3SBarry Smith### What if I want to link to the regular `.a` library files? 21349b92b1d3SBarry Smith 21359b92b1d3SBarry SmithYou must run `configure` with the option `--with-shared-libraries=0` (you can use a 21369b92b1d3SBarry Smithdifferent `$PETSC_ARCH` for this build so you can easily switch between the two). 21379b92b1d3SBarry Smith 21389b92b1d3SBarry Smith### What do I do if I want to move my executable to a different machine? 21399b92b1d3SBarry Smith 21409b92b1d3SBarry SmithYou would also need to have access to the shared libraries on this new machine. The other 21419b92b1d3SBarry Smithalternative is to build the executable without shared libraries by first deleting the 21429b92b1d3SBarry Smithshared libraries, and then creating the executable. 21439b92b1d3SBarry Smith 21449b92b1d3SBarry Smith```{bibliography} /petsc.bib 21459b92b1d3SBarry Smith:filter: docname in docnames 21469b92b1d3SBarry Smith``` 2147