1(ch_snes)= 2 3# SNES: Nonlinear Solvers 4 5The solution of large-scale nonlinear problems pervades many facets of 6computational science and demands robust and flexible solution 7strategies. The `SNES` library of PETSc provides a powerful suite of 8data-structure-neutral numerical routines for such problems. Built on 9top of the linear solvers and data structures discussed in preceding 10chapters, `SNES` enables the user to easily customize the nonlinear 11solvers according to the application at hand. Also, the `SNES` 12interface is *identical* for the uniprocess and parallel cases; the only 13difference in the parallel version is that each process typically forms 14only its local contribution to various matrices and vectors. 15 16The `SNES` class includes methods for solving systems of nonlinear 17equations of the form 18 19$$ 20\mathbf{F}(\mathbf{x}) = 0, 21$$ (fx0) 22 23where $\mathbf{F}: \, \Re^n \to \Re^n$. Newton-like methods provide the 24core of the package, including both line search and trust region 25techniques. A suite of nonlinear Krylov methods and methods based upon 26problem decomposition are also included. The solvers are discussed 27further in {any}`sec_nlsolvers`. Following the PETSc design 28philosophy, the interfaces to the various solvers are all virtually 29identical. In addition, the `SNES` software is completely flexible, so 30that the user can at runtime change any facet of the solution process. 31 32PETSc’s default method for solving the nonlinear equation is Newton’s 33method with line search, `SNESNEWTONLS`. The general form of the $n$-dimensional Newton’s method 34for solving {math:numref}`fx0` is 35 36$$ 37\mathbf{x}_{k+1} = \mathbf{x}_k - \mathbf{J}(\mathbf{x}_k)^{-1} \mathbf{F}(\mathbf{x}_k), \;\; k=0,1, \ldots, 38$$ (newton) 39 40where $\mathbf{x}_0$ is an initial approximation to the solution and 41$\mathbf{J}(\mathbf{x}_k) = \mathbf{F}'(\mathbf{x}_k)$, the Jacobian, is nonsingular at each 42iteration. In practice, the Newton iteration {math:numref}`newton` is 43implemented by the following two steps: 44 45$$ 46\begin{aligned} 471. & \text{(Approximately) solve} & \mathbf{J}(\mathbf{x}_k) \Delta \mathbf{x}_k &= -\mathbf{F}(\mathbf{x}_k). \\ 482. & \text{Update} & \mathbf{x}_{k+1} &\gets \mathbf{x}_k + \Delta \mathbf{x}_k. 49\end{aligned} 50$$ 51 52Other defect-correction algorithms can be implemented by using different 53choices for $J(\mathbf{x}_k)$. 54 55(sec_snesusage)= 56 57## Basic SNES Usage 58 59In the simplest usage of the nonlinear solvers, the user must merely 60provide a C, C++, Fortran, or Python routine to evaluate the nonlinear function 61{math:numref}`fx0`. The corresponding Jacobian matrix 62can be approximated with finite differences. For codes that are 63typically more efficient and accurate, the user can provide a routine to 64compute the Jacobian; details regarding these application-provided 65routines are discussed below. To provide an overview of the use of the 66nonlinear solvers, browse the concrete example in {ref}`ex1.c <snes-ex1>` or skip ahead to the discussion. 67 68(snes_ex1)= 69 70:::{admonition} Listing: `src/snes/tutorials/ex1.c` 71```{literalinclude} /../src/snes/tutorials/ex1.c 72:end-before: /*TEST 73``` 74::: 75 76To create a `SNES` solver, one must first call `SNESCreate()` as 77follows: 78 79``` 80SNESCreate(MPI_Comm comm, SNES *snes); 81``` 82 83The user must then set routines for evaluating the residual function {math:numref}`fx0` 84and, *possibly*, its associated Jacobian matrix, as 85discussed in the following sections. 86 87To choose a nonlinear solution method, the user can either call 88 89``` 90SNESSetType(SNES snes, SNESType method); 91``` 92 93or use the option `-snes_type <method>`, where details regarding the 94available methods are presented in {any}`sec_nlsolvers`. The 95application code can take complete control of the linear and nonlinear 96techniques used in the Newton-like method by calling 97 98``` 99SNESSetFromOptions(snes); 100``` 101 102This routine provides an interface to the PETSc options database, so 103that at runtime the user can select a particular nonlinear solver, set 104various parameters and customized routines (e.g., specialized line 105search variants), prescribe the convergence tolerance, and set 106monitoring routines. With this routine the user can also control all 107linear solver options in the `KSP`, and `PC` modules, as discussed 108in {any}`ch_ksp`. 109 110After having set these routines and options, the user solves the problem 111by calling 112 113``` 114SNESSolve(SNES snes, Vec b, Vec x); 115``` 116 117where `x` should be initialized to the initial guess before calling and contains the solution on return. 118In particular, to employ an initial guess of 119zero, the user should explicitly set this vector to zero by calling 120`VecZeroEntries(x)`. Finally, after solving the nonlinear system (or several 121systems), the user should destroy the `SNES` context with 122 123``` 124SNESDestroy(SNES *snes); 125``` 126 127(sec_snesfunction)= 128 129### Nonlinear Function Evaluation 130 131When solving a system of nonlinear equations, the user must provide a 132a residual function {math:numref}`fx0`, which is set using 133 134``` 135SNESSetFunction(SNES snes, Vec f, PetscErrorCode (*FormFunction)(SNES snes, Vec x, Vec f, PetscCtx ctx), PetscCtx ctx); 136``` 137 138The argument `f` is an optional vector for storing the solution; pass `NULL` to have the `SNES` allocate it for you. 139The argument `ctx` is an optional user-defined context, which can 140store any private, application-specific data required by the function 141evaluation routine; `NULL` should be used if such information is not 142needed. In C and C++, a user-defined context is merely a structure in 143which various objects can be stashed; in Fortran a user context can be 144an integer array that contains both parameters and pointers to PETSc 145objects. 146<a href="PETSC_DOC_OUT_ROOT_PLACEHOLDER/src/snes/tutorials/ex5.c.html">SNES Tutorial ex5</a> 147and 148<a href="PETSC_DOC_OUT_ROOT_PLACEHOLDER/src/snes/tutorials/ex5f90.F90.html">SNES Tutorial ex5f90</a> 149give examples of user-defined application contexts in C and Fortran, 150respectively. 151 152(sec_snesjacobian)= 153 154### Jacobian Evaluation 155 156The user may also specify a routine to form some approximation of the 157Jacobian matrix, `A`, at the current iterate, `x`, as is typically 158done with 159 160``` 161SNESSetJacobian(SNES snes, Mat Amat, Mat Pmat, PetscErrorCode (*FormJacobian)(SNES snes, Vec x, Mat A, Mat B, PetscCtx ctx), PetscCtx ctx); 162``` 163 164The arguments of the routine `FormJacobian()` are the current iterate, 165`x`; the (approximate) Jacobian matrix, `Amat`; the matrix from 166which the preconditioner is constructed, `Pmat` (which is usually the 167same as `Amat`); and an optional user-defined Jacobian context, 168`ctx`, for application-specific data. The `FormJacobian()` 169callback is only invoked if the solver requires it, always 170*after* `FormFunction()` has been called at the current iterate. 171 172Note that the `SNES` solvers 173are all data-structure neutral, so the full range of PETSc matrix 174formats (including “matrix-free” methods) can be used. 175{any}`ch_matrices` discusses information regarding 176available matrix formats and options, while {any}`sec_nlmatrixfree` focuses on matrix-free methods in 177`SNES`. We briefly touch on a few details of matrix usage that are 178particularly important for efficient use of the nonlinear solvers. 179 180A common usage paradigm is to assemble the problem Jacobian in the 181preconditioner storage `B`, rather than `A`. In the case where they 182are identical, as in many simulations, this makes no difference. 183However, it allows us to check the analytic Jacobian we construct in 184`FormJacobian()` by passing the `-snes_mf_operator` flag. This 185causes PETSc to approximate the Jacobian using finite differencing of 186the function evaluation (discussed in {any}`sec_fdmatrix`), 187and the analytic Jacobian becomes merely the preconditioner. Even if the 188analytic Jacobian is incorrect, it is likely that the finite difference 189approximation will converge, and thus this is an excellent method to 190verify the analytic Jacobian. Moreover, if the analytic Jacobian is 191incomplete (some terms are missing or approximate), 192`-snes_mf_operator` may be used to obtain the exact solution, where 193the Jacobian approximation has been transferred to the preconditioner. 194 195One such approximate Jacobian comes from “Picard linearization”, use `SNESSetPicard()`, which 196writes the nonlinear system as 197 198$$ 199\mathbf{F}(\mathbf{x}) := \mathbf{A}(\mathbf{x}) \mathbf{x} - \mathbf{b} = 0 200$$ 201 202where $\mathbf{A}(\mathbf{x})$ usually contains the lower-derivative parts of the 203equation. For example, the nonlinear diffusion problem 204 205$$ 206- \nabla\cdot(\kappa(u) \nabla u) = 0 207$$ 208 209would be linearized as 210 211$$ 212A(u) v \simeq -\nabla\cdot(\kappa(u) \nabla v). 213$$ 214 215Usually this linearization is simpler to implement than Newton and the 216linear problems are somewhat easier to solve. In addition to using 217`-snes_mf_operator` with this approximation to the Jacobian, the 218Picard iterative procedure can be performed by defining $\mathbf{J}(\mathbf{x})$ 219to be $\mathbf{A}(\mathbf{x})$. Sometimes this iteration exhibits better global 220convergence than Newton linearization. 221 222During successive calls to `FormJacobian()`, the user can either 223insert new matrix contexts or reuse old ones, depending on the 224application requirements. For many sparse matrix formats, reusing the 225old space (and merely changing the matrix elements) is more efficient; 226however, if the matrix nonzero structure completely changes, creating an 227entirely new matrix context may be preferable. Upon subsequent calls to 228the `FormJacobian()` routine, the user may wish to reinitialize the 229matrix entries to zero by calling `MatZeroEntries()`. See 230{any}`sec_othermat` for details on the reuse of the matrix 231context. 232 233The directory `$PETSC_DIR/src/snes/tutorials` provides a variety of 234examples. 235 236Sometimes a nonlinear solver may produce a step that is not within the domain 237of a given function, for example one with a negative pressure. When this occurs 238one can call `SNESSetFunctionDomainError()` or `SNESSetJacobianDomainError()` 239to indicate to `SNES` the step is not valid. One must also use `SNESGetConvergedReason()` 240and check the reason to confirm if the solver succeeded. See {any}`sec_vi` for how to 241provide `SNES` with bounds on the variables to solve (differential) variational inequalities 242and how to control properties of the line step computed. 243 244## Function Domain Errors and infinity or NaN 245 246Occasionally nonlinear solvers will propose solutions $u$, where the function value (or the objective function set with `SNESSetObjective()`) contains infinity or NaN. 247This can be due to bugs in the application code or because the proposed solution is not in the domain of the function. The application function can call `SNESSetFunctionDomainError()` or 248`SNESSetObjectiveDomainError()` to indicate $u$ is not in the function's domain. 249 250Some `SNESSolve()` implementations (and related `SNESLineSearchApply()` routines) attempt to recover from the infinity or NaN; generally by shrinking the step size. 251If they are unable to recover the `SNESConvergedReason` returned will be `SNES_DIVERGED_FUNCTION_DOMAIN`, `SNES_DIVERGED_OJECTIVE_DOMAIN`, `SNES_DIVERGED_FUNCTION_NANORINF`, or `SNES_DIVERGED_OJECTIVE_NANORINF`. 252 253 254(sec_nlsolvers)= 255 256## The Nonlinear Solvers 257 258As summarized in Table {any}`tab-snesdefaults`, `SNES` includes 259several Newton-like nonlinear solvers based on line search techniques 260and trust region methods. Also provided are several nonlinear Krylov 261methods, as well as nonlinear methods involving decompositions of the 262problem. 263 264Each solver may have associated with it a set of options, which can be 265set with routines and options database commands provided for this 266purpose. A complete list can be found by consulting the manual pages or 267by running a program with the `-help` option; we discuss just a few in 268the sections below. 269 270```{eval-rst} 271.. list-table:: PETSc Nonlinear Solvers 272 :name: tab-snesdefaults 273 :header-rows: 1 274 275 * - Method 276 - SNESType 277 - Options Name 278 - Default Line Search 279 * - Line Search Newton 280 - ``SNESNEWTONLS`` 281 - ``newtonls`` 282 - ``SNESLINESEARCHBT`` 283 * - Trust region Newton 284 - ``SNESNEWTONTR`` 285 - ``newtontr`` 286 - — 287 * - Newton with Arc Length Continuation 288 - ``SNESNEWTONAL`` 289 - ``newtonal`` 290 - — 291 * - Nonlinear Richardson 292 - ``SNESNRICHARDSON`` 293 - ``nrichardson`` 294 - ``SNESLINESEARCHSECANT`` 295 * - Nonlinear CG 296 - ``SNESNCG`` 297 - ``ncg`` 298 - ``SNESLINESEARCHCP`` 299 * - Nonlinear GMRES 300 - ``SNESNGMRES`` 301 - ``ngmres`` 302 - ``SNESLINESEARCHSECANT`` 303 * - Quasi-Newton 304 - ``SNESQN`` 305 - ``qn`` 306 - see :any:`tab-qndefaults` 307 * - Full Approximation Scheme 308 - ``SNESFAS`` 309 - ``fas`` 310 - — 311 * - Nonlinear ASM 312 - ``SNESNASM`` 313 - ``nasm`` 314 - – 315 * - ASPIN 316 - ``SNESASPIN`` 317 - ``aspin`` 318 - ``SNESLINESEARCHBT`` 319 * - Nonlinear Gauss-Seidel 320 - ``SNESNGS`` 321 - ``ngs`` 322 - – 323 * - Anderson Mixing 324 - ``SNESANDERSON`` 325 - ``anderson`` 326 - – 327 * - Newton with constraints (1) 328 - ``SNESVINEWTONRSLS`` 329 - ``vinewtonrsls`` 330 - ``SNESLINESEARCHBT`` 331 * - Newton with constraints (2) 332 - ``SNESVINEWTONSSLS`` 333 - ``vinewtonssls`` 334 - ``SNESLINESEARCHBT`` 335 * - Multi-stage Smoothers 336 - ``SNESMS`` 337 - ``ms`` 338 - – 339 * - Composite 340 - ``SNESCOMPOSITE`` 341 - ``composite`` 342 - – 343 * - Linear solve only 344 - ``SNESKSPONLY`` 345 - ``ksponly`` 346 - – 347 * - Python Shell 348 - ``SNESPYTHON`` 349 - ``python`` 350 - – 351 * - Shell (user-defined) 352 - ``SNESSHELL`` 353 - ``shell`` 354 - – 355 356``` 357 358### Line Search Newton 359 360The method `SNESNEWTONLS` (`-snes_type newtonls`) provides a 361line search Newton method for solving systems of nonlinear equations. By 362default, this technique employs cubic backtracking 363{cite}`dennis:83`. Alternative line search techniques are 364listed in Table {any}`tab-linesearches`. 365 366```{eval-rst} 367.. table:: PETSc Line Search Methods 368 :name: tab-linesearches 369 370 ==================== =========================== ================ 371 **Line Search** **SNESLineSearchType** **Options Name** 372 ==================== =========================== ================ 373 Backtracking ``SNESLINESEARCHBT`` ``bt`` 374 (damped) step ``SNESLINESEARCHBASIC`` ``basic`` 375 identical to above ``SNESLINESEARCHNONE`` ``none`` 376 Secant method ``SNESLINESEARCHSECANT`` ``secant`` 377 Critical point ``SNESLINESEARCHCP`` ``cp`` 378 Error-oriented ``SNESLINESEARCHNLEQERR`` ``nleqerr`` 379 Bisection ``SNESLINESEARCHBISECTION`` ``bisection`` 380 Shell ``SNESLINESEARCHSHELL`` ``shell`` 381 ==================== =========================== ================ 382``` 383 384Every `SNES` has a line search context of type `SNESLineSearch` that 385may be retrieved using 386 387``` 388SNESGetLineSearch(SNES snes, SNESLineSearch *ls);. 389``` 390 391There are several default options for the line searches. The order of 392polynomial approximation may be set with `-snes_linesearch_order` or 393 394``` 395SNESLineSearchSetOrder(SNESLineSearch ls, PetscInt order); 396``` 397 398for instance, 2 for quadratic or 3 for cubic. Sometimes, it may not be 399necessary to monitor the progress of the nonlinear iteration. In this 400case, `-snes_linesearch_norms` or 401 402``` 403SNESLineSearchSetComputeNorms(SNESLineSearch ls, PetscBool norms); 404``` 405 406may be used to turn off function, step, and solution norm computation at 407the end of the linesearch. 408 409The default line search for the line search Newton method, 410`SNESLINESEARCHBT` involves several parameters, which are set to 411defaults that are reasonable for many applications. The user can 412override the defaults by using the following options: 413 414- `-snes_linesearch_alpha <alpha>` 415- `-snes_linesearch_maxstep <max>` 416- `-snes_linesearch_minlambda <tol>` 417 418Besides the backtracking linesearch, there are `SNESLINESEARCHSECANT`, 419which uses a polynomial secant minimization of $||F(x)||_2$ or an objective function 420if set, and `SNESLINESEARCHCP`, which minimizes $F(x) \cdot Y$ where 421$Y$ is the search direction. These are both potentially iterative 422line searches, which may be used to find a better-fitted steplength in 423the case where a single secant search is not sufficient. The number of 424iterations may be set with `-snes_linesearch_max_it`. In addition, the 425convergence criteria of the iterative line searches may be set using 426function tolerances `-snes_linesearch_rtol` and 427`-snes_linesearch_atol`, and steplength tolerance 428`snes_linesearch_ltol`. 429 430For highly non-linear problems, the bisection line search `SNESLINESEARCHBISECTION` 431may prove useful due to its robustness. Similar to the critical point line search 432`SNESLINESEARCHCP`, it seeks to find the root of $F(x) \cdot Y$. 433While the latter does so through a secant method, the bisection line search 434does so by iteratively bisecting the step length interval. 435It works as follows (with $f(\lambda)=F(x-\lambda Y) \cdot Y / ||Y||$ for brevity): 436 4371. initialize: $j=1$, $\lambda_0 = \lambda_{\text{left}} = 0.0$, $\lambda_j = \lambda_{\text{right}} = \alpha$, compute $f(\lambda_0)$ and $f(\lambda_j)$ 438 4392. check whether there is a change of sign in the interval: $f(\lambda_{\text{left}}) f(\lambda_j) \leq 0$; if not accept the full step length $\lambda_1$ 440 4413. if there is a change of sign, enter iterative bisection procedure 442 443 1. check convergence/ exit criteria: 444 445 - absolute tolerance $f(\lambda_j) < \mathtt{atol}$ 446 - relative tolerance $f(\lambda_j) < \mathtt{rtol} \cdot f(\lambda_0)$ 447 - change of step length $\lambda_j - \lambda_{j-1} < \mathtt{ltol}$ 448 - number of iterations $j < \mathtt{max\_it}$ 449 450 2. if $j > 1$, determine direction of bisection 451 452 $$ 453 \begin{aligned}\lambda_{\text{left}} &= \begin{cases}\lambda_{\text{left}} &f(\lambda_{\text{left}}) f(\lambda_j) \leq 0\\\lambda_{j} &\text{else}\\ \end{cases}\\ \lambda_{\text{right}} &= \begin{cases} \lambda_j &f(\lambda_{\text{left}}) f(\lambda_j) \leq 0\\\lambda_{\text{right}} &\text{else}\\ \end{cases}\\\end{aligned} 454 $$ 455 456 3. bisect the interval: $\lambda_{j+1} = (\lambda_{\text{left}} + \lambda_{\text{right}})/2$, compute $f(\lambda_{j+1})$ 457 4. update variables for the next iteration: $\lambda_j \gets \lambda_{j+1}$, $f(\lambda_j) \gets f(\lambda_{j+1})$, $j \gets j+1$ 458 459Custom line search types may either be defined using 460`SNESLineSearchShell`, or by creating a custom user line search type 461in the model of the preexisting ones and register it using 462 463``` 464SNESLineSearchRegister(const char sname[], PetscErrorCode (*function)(SNESLineSearch));. 465``` 466 467### Trust Region Methods 468 469The trust region method in `SNES` for solving systems of nonlinear 470equations, `SNESNEWTONTR` (`-snes_type newtontr`), is similar to the one developed in the 471MINPACK project {cite}`more84`. Several parameters can be 472set to control the variation of the trust region size during the 473solution process. In particular, the user can control the initial trust 474region radius, computed by 475 476$$ 477\Delta = \Delta_0 \| F_0 \|_2, 478$$ 479 480by setting $\Delta_0$ via the option `-snes_tr_delta0 <delta0>`. 481 482### Newton with Arc Length Continuation 483 484The Newton method with arc length continuation reformulates the linearized system 485$K\delta \mathbf x = -\mathbf F(\mathbf x)$ by introducing the load parameter 486$\lambda$ and splitting the residual into two components, commonly 487corresponding to internal and external forces: 488 489$$ 490\mathbf F(x, \lambda) = \mathbf F^{\mathrm{int}}(\mathbf x) - \mathbf F^{\mathrm{ext}}(\mathbf x, \lambda) 491$$ 492 493Often, $\mathbf F^{\mathrm{ext}}(\mathbf x, \lambda)$ is linear in $\lambda$, 494which can be thought of as applying the external force in proportional load 495increments. By default, this is how the right-hand side vector is handled in the 496implemented method. Generally, however, $\mathbf F^{\mathrm{ext}}(\mathbf x, \lambda)$ 497may depend non-linearly on $\lambda$ or $\mathbf x$, or both. 498To accommodate this possibility, we provide the `SNESNewtonALGetLoadParameter()` 499function, which allows for the current value of $\lambda$ to be queried in the 500functions provided to `SNESSetFunction()` and `SNESSetJacobian()`. 501 502Additionally, we split the solution update into two components: 503 504$$ 505\delta \mathbf x = \delta s\delta\mathbf x^F + \delta\lambda\delta\mathbf x^Q, 506$$ 507 508where $\delta s = 1$ unless partial corrections are used (discussed more 509below). Each of $\delta \mathbf x^F$ and $\delta \mathbf x^Q$ are found via 510solving a linear system with the Jacobian $K$: 511 512- $\delta \mathbf x^F$ is the full Newton step for a given value of $\lambda$: $K \delta \mathbf x^F = -\mathbf F(\mathbf x, \lambda)$ 513- $\delta \mathbf x^Q$ is the variation in $\mathbf x$ with respect to $\lambda$, computed by $K \delta\mathbf x^Q = \mathbf Q(\mathbf x, \lambda)$, where $\mathbf Q(\mathbf x, \lambda) = -\partial \mathbf F (\mathbf x, \lambda) / \partial \lambda$ is the tangent load vector. 514 515Often, the tangent load vector $\mathbf Q$ is constant within a load increment, 516which corresponds to the case of proportional loading discussed above. By default, 517$\mathbf Q$ is the full right-hand-side vector, if one was provided. 518The user can also provide a function which computes $\mathbf Q$ to 519`SNESNewtonALSetFunction()`. This function should have the same signature as for 520`SNESSetFunction`, and the user should use `SNESNewtonALGetLoadParameter()` to get 521$\lambda$ if it is needed. 522 523**The Constraint Surface.** Considering the $n+1$ dimensional space of 524$\mathbf x$ and $\lambda$, we define the linearized equilibrium line to be 525the set of points for which the linearized equilibrium equations are satisfied. 526Given the previous iterative solution 527$\mathbf t^{(j-1)} = [\mathbf x^{(j-1)}, \lambda^{(j-1)}]$, 528this line is defined by the point $\mathbf t^{(j-1)} + [\delta\mathbf x^F, 0]$ and 529the vector $\mathbf t^Q [\delta\mathbf x^Q, 1]$. 530The arc length method seeks the intersection of this linearized equilibrium line 531with a quadratic constraint surface, defined by 532 533% math::L^2 = \|\Delta x\|^2 + \psi^2 (\Delta\lambda)^2, 534 535where $L$ is a user-provided step size corresponding to the radius of the 536constraint surface, $\Delta\mathbf x$ and $\Delta\lambda$ are the 537accumulated updates over the current load step, and $\psi^2$ is a 538user-provided consistency parameter determining the shape of the constraint surface. 539Generally, $\psi^2 > 0$ leads to a hyper-sphere constraint surface, while 540$\psi^2 = 0$ leads to a hyper-cylinder constraint surface. 541 542Since the solution will always fall on the constraint surface, the method will often 543require multiple incremental steps to fully solve the non-linear problem. 544This is necessary to accurately trace the equilibrium path. 545Importantly, this is fundamentally different from time stepping. 546While a similar process could be implemented as a `TS`, this method is 547particularly designed to be used as a SNES, either standalone or within a `TS`. 548 549To this end, by default, the load parameter is used such that the full external 550forces are applied at $\lambda = 1$, although we allow for the user to specify 551a different value via `-snes_newtonal_lambda_max`. 552To ensure that the solution corresponds exactly to the external force prescribed by 553the user, i.e. that the load parameter is exactly $\lambda_{max}$ at the end 554of the SNES solve, we clamp the value before computing the solution update. 555As such, the final increment will likely be a hybrid of arc length continuation and 556normal Newton iterations. 557 558**Choosing the Continuation Step.** For the first iteration from an equilibrium 559point, there is a single correct way to choose $\delta\lambda$, which follows 560from the constraint equations. Specifically the constraint equations yield the 561quadratic equation $a\delta\lambda^2 + b\delta\lambda + c = 0$, where 562 563$$ 564\begin{aligned} 565a &= \|\delta\mathbf x^Q\|^2 + \psi^2,\\ 566b &= 2\delta\mathbf x^Q\cdot (\Delta\mathbf x + \delta s\delta\mathbf x^F) + 2\psi^2 \Delta\lambda,\\ 567c &= \|\Delta\mathbf x + \delta s\delta\mathbf x^F\|^2 + \psi^2 \Delta\lambda^2 - L^2. 568\end{aligned} 569$$ 570 571Since in the first iteration, $\Delta\mathbf x = \delta\mathbf x^F = \mathbf 0$ and 572$\Delta\lambda = 0$, $b = 0$ and the equation simplifies to a pair of 573real roots: 574 575$$ 576\delta\lambda = \pm\frac{L}{\sqrt{\|\delta\mathbf x^Q\|^2 + \psi^2}}, 577$$ 578 579where the sign is positive for the first increment and is determined by the previous 580increment otherwise as 581 582$$ 583\text{sign}(\delta\lambda) = \text{sign}\big(\delta\mathbf x^Q \cdot (\Delta\mathbf x)_{i-1} + \psi^2(\Delta\lambda)_{i-1}\big), 584$$ 585 586where $(\Delta\mathbf x)_{i-1}$ and $(\Delta\lambda)_{i-1}$ are the 587accumulated updates over the previous load step. 588 589In subsequent iterations, there are different approaches to selecting 590$\delta\lambda$, all of which have trade-offs. 591The main difference is whether the iterative solution falls on the constraint 592surface at every iteration, or only when fully converged. 593PETSc implements two approaches, set via 594`SNESNewtonALSetCorrectionType()` or 595`-snes_newtonal_correction_type <normal|exact>` on the command line. 596 597**Corrections in the Normal Hyperplane.** The `SNES_NEWTONAL_CORRECTION_NORMAL` 598option is simpler and computationally less expensive, but may fail to converge, as 599the constraint equation is not satisfied at every iteration. 600The update $\delta \lambda$ is chosen such that the update is within the 601normal hyper-surface to the quadratic constraint surface. 602Mathematically, that is 603 604$$ 605\delta \lambda = -\frac{\Delta \mathbf x \cdot \delta \mathbf x^F}{\Delta\mathbf x \cdot \delta\mathbf x^Q + \psi^2 \Delta\lambda}. 606$$ 607 608This implementation is based on {cite}`LeonPaulinoPereiraMenezesLages_2011`. 609 610**Exact Corrections.** The `SNES_NEWTONAL_CORRECTION_EXACT` option is far more 611complex, but ensures that the constraint is exactly satisfied at every Newton 612iteration. As such, it is generally more robust. 613By evaluating the intersection of constraint surface and equilibrium line at each 614iteration, $\delta\lambda$ is chosen as one of the roots of the above 615quadratic equation $a\delta\lambda^2 + b\delta\lambda + c = 0$. 616This method encounters issues, however, if the linearized equilibrium line and 617constraint surface do not intersect due to particularly large linearized error. 618In this case, the roots are complex. 619To continue progressing toward a solution, this method uses a partial correction by 620choosing $\delta s$ such that the quadratic equation has a single real root. 621Geometrically, this is selecting the point on the constraint surface closest to the 622linearized equilibrium line. See the code or {cite}`Ritto-CorreaCamotim2008` for a 623mathematical description of these partial corrections. 624 625### Nonlinear Krylov Methods 626 627A number of nonlinear Krylov methods are provided, including Nonlinear 628Richardson (`SNESNRICHARDSON`), nonlinear conjugate gradient (`SNESNCG`), nonlinear GMRES (`SNESNGMRES`), and Anderson Mixing (`SNESANDERSON`). These 629methods are described individually below. They are all instrumental to 630PETSc’s nonlinear preconditioning. 631 632**Nonlinear Richardson.** The nonlinear Richardson iteration, `SNESNRICHARDSON`, merely 633takes the form of a line search-damped fixed-point iteration of the form 634 635$$ 636\mathbf{x}_{k+1} = \mathbf{x}_k - \lambda \mathbf{F}(\mathbf{x}_k), \;\; k=0,1, \ldots, 637$$ 638 639where the default linesearch is `SNESLINESEARCHSECANT`. This simple solver 640is mostly useful as a nonlinear smoother, or to provide line search 641stabilization to an inner method. 642 643**Nonlinear Conjugate Gradients.** Nonlinear CG, `SNESNCG`, is equivalent to linear 644CG, but with the steplength determined by line search 645(`SNESLINESEARCHCP` by default). Five variants (Fletcher-Reed, 646Hestenes-Steifel, Polak-Ribiere-Polyak, Dai-Yuan, and Conjugate Descent) 647are implemented in PETSc and may be chosen using 648 649``` 650SNESNCGSetType(SNES snes, SNESNCGType btype); 651``` 652 653**Anderson Mixing and Nonlinear GMRES Methods.** Nonlinear GMRES (`SNESNGMRES`), and 654Anderson Mixing (`SNESANDERSON`) methods combine the last $m$ iterates, plus a new 655fixed-point iteration iterate, into an approximate residual-minimizing new iterate. 656 657All of the above methods have support for using a nonlinear preconditioner to compute the preliminary update step, rather than the default 658which is the nonlinear function's residual, \$ mathbf\{F}(mathbf\{x}\_k)\$. The different update is obtained by solving a nonlinear preconditioner nonlinear problem, which has its own 659`SNES` object that may be obtained with `SNESGetNPC()`. 660Quasi-Newton Methods 661^^^^^^^^^^^^^^^^^^^^ 662 663Quasi-Newton methods store iterative rank-one updates to the Jacobian 664instead of computing the Jacobian directly. Three limited-memory quasi-Newton 665methods are provided, L-BFGS, which are described in 666Table {any}`tab-qndefaults`. These all are encapsulated under 667`-snes_type qn` and may be changed with `snes_qn_type`. The default 668is L-BFGS, which provides symmetric updates to an approximate Jacobian. 669This iteration is similar to the line search Newton methods. 670 671The quasi-Newton methods support the use of a nonlinear preconditioner that can be obtained with `SNESGetNPC()` and then configured; or that can be configured with 672`SNES`, `KSP`, and `PC` options using the options database prefix `-npc_`. 673 674```{eval-rst} 675.. list-table:: PETSc quasi-Newton solvers 676 :name: tab-qndefaults 677 :header-rows: 1 678 679 * - QN Method 680 - ``SNESQNType`` 681 - Options Name 682 - Default Line Search 683 * - L-BFGS 684 - ``SNES_QN_LBFGS`` 685 - ``lbfgs`` 686 - ``SNESLINESEARCHCP`` 687 * - “Good” Broyden 688 - ``SNES_QN_BROYDEN`` 689 - ``broyden`` 690 - ``SNESLINESEARCHBASIC`` (or equivalently ``SNESLINESEARCHNONE`` 691 * - “Bad” Broyden 692 - ``SNES_QN_BADBROYDEN`` 693 - ``badbroyden`` 694 - ``SNESLINESEARCHSECANT`` 695``` 696 697One may also control the form of the initial Jacobian approximation with 698 699``` 700SNESQNSetScaleType(SNES snes, SNESQNScaleType stype); 701``` 702 703and the restart type with 704 705``` 706SNESQNSetRestartType(SNES snes, SNESQNRestartType rtype); 707``` 708 709### The Full Approximation Scheme 710 711The Nonlinear Full Approximation Scheme (FAS) `SNESFAS`, is a nonlinear multigrid method. At 712each level, there is a recursive cycle control `SNES` instance, and 713either one or two nonlinear solvers that act as smoothers (up and down). Problems 714set up using the `SNES` `DMDA` interface are automatically 715coarsened. FAS, `SNESFAS`, differs slightly from linear multigrid `PCMG`, in that the hierarchy is 716constructed recursively. However, much of the interface is a one-to-one 717map. We describe the “get” operations here, and it can be assumed that 718each has a corresponding “set” operation. For instance, the number of 719levels in the hierarchy may be retrieved using 720 721``` 722SNESFASGetLevels(SNES snes, PetscInt *levels); 723``` 724 725There are four `SNESFAS` cycle types, `SNES_FAS_MULTIPLICATIVE`, 726`SNES_FAS_ADDITIVE`, `SNES_FAS_FULL`, and `SNES_FAS_KASKADE`. The 727type may be set with 728 729``` 730SNESFASSetType(SNES snes, SNESFASType fastype);. 731``` 732 733and the cycle type, 1 for V, 2 for W, may be set with 734 735``` 736SNESFASSetCycles(SNES snes, PetscInt cycles);. 737``` 738 739Much like the interface to `PCMG` described in {any}`sec_mg`, there are interfaces to recover the 740various levels’ cycles and smoothers. The level smoothers may be 741accessed with 742 743``` 744SNESFASGetSmoother(SNES snes, PetscInt level, SNES *smooth); 745SNESFASGetSmootherUp(SNES snes, PetscInt level, SNES *smooth); 746SNESFASGetSmootherDown(SNES snes, PetscInt level, SNES *smooth); 747``` 748 749and the level cycles with 750 751``` 752SNESFASGetCycleSNES(SNES snes, PetscInt level, SNES *lsnes);. 753``` 754 755Also akin to `PCMG`, the restriction and prolongation at a level may 756be acquired with 757 758``` 759SNESFASGetInterpolation(SNES snes, PetscInt level, Mat *mat); 760SNESFASGetRestriction(SNES snes, PetscInt level, Mat *mat); 761``` 762 763In addition, FAS requires special restriction for solution-like 764variables, called injection. This may be set with 765 766``` 767SNESFASGetInjection(SNES snes, PetscInt level, Mat *mat);. 768``` 769 770The coarse solve context may be acquired with 771 772``` 773SNESFASGetCoarseSolve(SNES snes, SNES *smooth); 774``` 775 776### Nonlinear Additive Schwarz 777 778Nonlinear Additive Schwarz methods (NASM) take a number of local 779nonlinear subproblems, solves them independently in parallel, and 780combines those solutions into a new approximate solution. 781 782``` 783SNESNASMSetSubdomains(SNES snes, PetscInt n, SNES subsnes[], VecScatter iscatter[], VecScatter oscatter[], VecScatter gscatter[]); 784``` 785 786allows for the user to create these local subdomains. Problems set up 787using the `SNES` `DMDA` interface are automatically decomposed. To 788begin, the type of subdomain updates to the whole solution are limited 789to two types borrowed from `PCASM`: `PC_ASM_BASIC`, in which the 790overlapping updates added. `PC_ASM_RESTRICT` updates in a 791nonoverlapping fashion. This may be set with 792 793``` 794SNESNASMSetType(SNES snes, PCASMType type);. 795``` 796 797`SNESASPIN` is a helper `SNES` type that sets up a nonlinearly 798preconditioned Newton’s method using NASM as the preconditioner. 799 800## General Options 801 802This section discusses options and routines that apply to all `SNES` 803solvers and problem classes. In particular, we focus on convergence 804tests, monitoring routines, and tools for checking derivative 805computations. 806 807(sec_snesconvergence)= 808 809### Convergence Tests 810 811Convergence of the nonlinear solvers can be detected in a variety of 812ways; the user can even specify a customized test, as discussed below. 813Most of the nonlinear solvers use `SNESConvergenceTestDefault()`, 814however, `SNESNEWTONTR` uses a method-specific additional convergence 815test as well. The convergence tests involves several parameters, which 816are set by default to values that should be reasonable for a wide range 817of problems. The user can customize the parameters to the problem at 818hand by using some of the following routines and options. 819 820One method of convergence testing is to declare convergence when the 821norm of the change in the solution between successive iterations is less 822than some tolerance, `stol`. Convergence can also be determined based 823on the norm of the function. Such a test can use either the absolute 824size of the norm, `atol`, or its relative decrease, `rtol`, from an 825initial guess. The following routine sets these parameters, which are 826used in many of the default `SNES` convergence tests: 827 828``` 829SNESSetTolerances(SNES snes, PetscReal atol, PetscReal rtol, PetscReal stol, PetscInt its, PetscInt fcts); 830``` 831 832This routine also sets the maximum numbers of allowable nonlinear 833iterations, `its`, and function evaluations, `fcts`. The 834corresponding options database commands for setting these parameters are: 835 836- `-snes_atol <atol>` 837- `-snes_rtol <rtol>` 838- `-snes_stol <stol>` 839- `-snes_max_it <its>` 840- `-snes_max_funcs <fcts>` (use `unlimited` for no maximum) 841 842A related routine is `SNESGetTolerances()`. `PETSC_CURRENT` may be used 843for any parameter to indicate the current value should be retained; use `PETSC_DETERMINE` to restore to the default value from when the object was created. 844 845Users can set their own customized convergence tests in `SNES` by 846using the command 847 848``` 849SNESSetConvergenceTest(SNES snes, PetscErrorCode (*test)(SNES snes, PetscInt it, PetscReal xnorm, PetscReal gnorm, PetscReal f, SNESConvergedReason reason, PetscCtx cctx), PetscCtx cctx, PetscCtxDestroyFn *destroy); 850``` 851 852The final argument of the convergence test routine, `cctx`, denotes an 853optional user-defined context for private data. When solving systems of 854nonlinear equations, the arguments `xnorm`, `gnorm`, and `f` are 855the current iterate norm, current step norm, and function norm, 856respectively. `SNESConvergedReason` should be set positive for 857convergence and negative for divergence. See `include/petscsnes.h` for 858a list of values for `SNESConvergedReason`. 859 860(sec_snesmonitor)= 861 862### Convergence Monitoring 863 864By default the `SNES` solvers run silently without displaying 865information about the iterations. The user can initiate monitoring with 866the command 867 868``` 869SNESMonitorSet(SNES snes, PetscErrorCode (*mon)(SNES snes, PetscInt its, PetscReal norm, PetscCtx mctx), PetscCtx mctx, (PetscCtxDestroyFn *)*monitordestroy); 870``` 871 872The routine, `mon`, indicates a user-defined monitoring routine, where 873`its` and `mctx` respectively denote the iteration number and an 874optional user-defined context for private data for the monitor routine. 875The argument `norm` is the function norm. 876 877The routine set by `SNESMonitorSet()` is called once after every 878successful step computation within the nonlinear solver. Hence, the user 879can employ this routine for any application-specific computations that 880should be done after the solution update. The option `-snes_monitor` 881activates the default `SNES` monitor routine, 882`SNESMonitorDefault()`, while `-snes_monitor_lg_residualnorm` draws 883a simple line graph of the residual norm’s convergence. 884 885One can cancel hardwired monitoring routines for `SNES` at runtime 886with `-snes_monitor_cancel`. 887 888As the Newton method converges so that the residual norm is small, say 889$10^{-10}$, many of the final digits printed with the 890`-snes_monitor` option are meaningless. Worse, they are different on 891different machines; due to different round-off rules used by, say, the 892IBM RS6000 and the Sun SPARC. This makes testing between different 893machines difficult. The option `-snes_monitor_short` causes PETSc to 894print fewer of the digits of the residual norm as it gets smaller; thus 895on most of the machines it will always print the same numbers making 896cross-process testing easier. 897 898The routines 899 900``` 901SNESGetSolution(SNES snes, Vec *x); 902SNESGetFunction(SNES snes, Vec *r, PetscCtxRt ctx, int(**func)(SNES, Vec, Vec, PetscCtx)); 903``` 904 905return the solution vector and function vector from a `SNES` context. 906These routines are useful, for instance, if the convergence test 907requires some property of the solution or function other than those 908passed with routine arguments. 909 910(sec_snesderivs)= 911 912### Checking Accuracy of Derivatives 913 914Since hand-coding routines for Jacobian matrix evaluation can be error 915prone, `SNES` provides easy-to-use support for checking these matrices 916against finite difference versions. In the simplest form of comparison, 917users can employ the option `-snes_test_jacobian` to compare the 918matrices at several points. Although not exhaustive, this test will 919generally catch obvious problems. One can compare the elements of the 920two matrices by using the option `-snes_test_jacobian_view` , which 921causes the two matrices to be printed to the screen. 922 923Another means for verifying the correctness of a code for Jacobian 924computation is running the problem with either the finite difference or 925matrix-free variant, `-snes_fd` or `-snes_mf`; see {any}`sec_fdmatrix` or {any}`sec_nlmatrixfree`. 926If a 927problem converges well with these matrix approximations but not with a 928user-provided routine, the problem probably lies with the hand-coded 929matrix. See the note in {any}`sec_snesjacobian` about 930assembling your Jabobian in the "preconditioner" slot `Pmat`. 931 932The correctness of user provided `MATSHELL` Jacobians in general can be 933checked with `MatShellTestMultTranspose()` and `MatShellTestMult()`. 934 935The correctness of user provided `MATSHELL` Jacobians via `TSSetRHSJacobian()` 936can be checked with `TSRHSJacobianTestTranspose()` and `TSRHSJacobianTest()` 937that check the correction of the matrix-transpose vector product and the 938matrix-product. From the command line, these can be checked with 939 940- `-ts_rhs_jacobian_test_mult_transpose` 941- `-mat_shell_test_mult_transpose_view` 942- `-ts_rhs_jacobian_test_mult` 943- `-mat_shell_test_mult_view` 944 945## Inexact Newton-like Methods 946 947Since exact solution of the linear Newton systems within {math:numref}`newton` 948at each iteration can be costly, modifications 949are often introduced that significantly reduce these expenses and yet 950retain the rapid convergence of Newton’s method. Inexact or truncated 951Newton techniques approximately solve the linear systems using an 952iterative scheme. In comparison with using direct methods for solving 953the Newton systems, iterative methods have the virtue of requiring 954little space for matrix storage and potentially saving significant 955computational work. Within the class of inexact Newton methods, of 956particular interest are Newton-Krylov methods, where the subsidiary 957iterative technique for solving the Newton system is chosen from the 958class of Krylov subspace projection methods. Note that at runtime the 959user can set any of the linear solver options discussed in {any}`ch_ksp`, 960such as `-ksp_type <ksp_method>` and 961`-pc_type <pc_method>`, to set the Krylov subspace and preconditioner 962methods. 963 964Two levels of iterations occur for the inexact techniques, where during 965each global or outer Newton iteration a sequence of subsidiary inner 966iterations of a linear solver is performed. Appropriate control of the 967accuracy to which the subsidiary iterative method solves the Newton 968system at each global iteration is critical, since these inner 969iterations determine the asymptotic convergence rate for inexact Newton 970techniques. While the Newton systems must be solved well enough to 971retain fast local convergence of the Newton’s iterates, use of excessive 972inner iterations, particularly when $\| \mathbf{x}_k - \mathbf{x}_* \|$ is large, 973is neither necessary nor economical. Thus, the number of required inner 974iterations typically increases as the Newton process progresses, so that 975the truncated iterates approach the true Newton iterates. 976 977A sequence of nonnegative numbers $\{\eta_k\}$ can be used to 978indicate the variable convergence criterion. In this case, when solving 979a system of nonlinear equations, the update step of the Newton process 980remains unchanged, and direct solution of the linear system is replaced 981by iteration on the system until the residuals 982 983$$ 984\mathbf{r}_k^{(i)} = \mathbf{F}'(\mathbf{x}_k) \Delta \mathbf{x}_k + \mathbf{F}(\mathbf{x}_k) 985$$ 986 987satisfy 988 989$$ 990\frac{ \| \mathbf{r}_k^{(i)} \| }{ \| \mathbf{F}(\mathbf{x}_k) \| } \leq \eta_k \leq \eta < 1. 991$$ 992 993Here $\mathbf{x}_0$ is an initial approximation of the solution, and 994$\| \cdot \|$ denotes an arbitrary norm in $\Re^n$ . 995 996By default a constant relative convergence tolerance is used for solving 997the subsidiary linear systems within the Newton-like methods of 998`SNES`. When solving a system of nonlinear equations, one can instead 999employ the techniques of Eisenstat and Walker {cite}`ew96` 1000to compute $\eta_k$ at each step of the nonlinear solver by using 1001the option `-snes_ksp_ew` . In addition, by adding one’s own 1002`KSP` convergence test (see {any}`sec_convergencetests`), one can easily create one’s own, 1003problem-dependent, inner convergence tests. 1004 1005(sec_nlmatrixfree)= 1006 1007## Matrix-Free Methods 1008 1009The `SNES` class fully supports matrix-free methods. The matrices 1010specified in the Jacobian evaluation routine need not be conventional 1011matrices; instead, they can point to the data required to implement a 1012particular matrix-free method. The matrix-free variant is allowed *only* 1013when the linear systems are solved by an iterative method in combination 1014with no preconditioning (`PCNONE` or `-pc_type` `none`), a 1015user-provided matrix from which to construct the preconditioner, or a user-provided preconditioner 1016shell (`PCSHELL`, discussed in {any}`sec_pc`); that 1017is, obviously matrix-free methods cannot be used with a direct solver, 1018approximate factorization, or other preconditioner which requires access 1019to explicit matrix entries. 1020 1021The user can create a matrix-free context for use within `SNES` with 1022the routine 1023 1024``` 1025MatCreateSNESMF(SNES snes, Mat *mat); 1026``` 1027 1028This routine creates the data structures needed for the matrix-vector 1029products that arise within Krylov space iterative 1030methods {cite}`brownsaad:90`. 1031The default `SNES` 1032matrix-free approximations can also be invoked with the command 1033`-snes_mf`. Or, one can retain the user-provided Jacobian 1034preconditioner, but replace the user-provided Jacobian matrix with the 1035default matrix-free variant with the option `-snes_mf_operator`. 1036 1037`MatCreateSNESMF()` uses 1038 1039``` 1040MatCreateMFFD(Vec x, Mat *mat); 1041``` 1042 1043which can also be used directly for users who need a matrix-free matrix but are not using `SNES`. 1044 1045The user can set one parameter to control the Jacobian-vector product 1046approximation with the command 1047 1048``` 1049MatMFFDSetFunctionError(Mat mat, PetscReal rerror); 1050``` 1051 1052The parameter `rerror` should be set to the square root of the 1053relative error in the function evaluations, $e_{rel}$; the default 1054is the square root of machine epsilon (about $10^{-8}$ in double 1055precision), which assumes that the functions are evaluated to full 1056floating-point precision accuracy. This parameter can also be set from 1057the options database with `-mat_mffd_err <err>` 1058 1059In addition, PETSc provides ways to register new routines to compute 1060the differencing parameter ($h$); see the manual page for 1061`MatMFFDSetType()` and `MatMFFDRegister()`. We currently provide two 1062default routines accessible via `-mat_mffd_type <ds or wp>`. For 1063the default approach there is one “tuning” parameter, set with 1064 1065``` 1066MatMFFDDSSetUmin(Mat mat, PetscReal umin); 1067``` 1068 1069This parameter, `umin` (or $u_{min}$), is a bit involved; its 1070default is $10^{-6}$ . Its command line form is `-mat_mffd_umin <umin>`. 1071 1072The Jacobian-vector product is approximated 1073via the formula 1074 1075$$ 1076F'(u) a \approx \frac{F(u + h*a) - F(u)}{h} 1077$$ 1078 1079where $h$ is computed via 1080 1081$$ 1082h = e_{\text{rel}} \cdot \begin{cases} 1083u^{T}a/\lVert a \rVert^2_2 & \text{if $|u^T a| > u_{\min} \lVert a \rVert_{1}$} \\ 1084u_{\min} \operatorname{sign}(u^{T}a) \lVert a \rVert_{1}/\lVert a\rVert^2_2 & \text{otherwise}. 1085\end{cases} 1086$$ 1087 1088This approach is taken from Brown and Saad 1089{cite}`brownsaad:90`. The second approach, taken from Walker and Pernice, 1090{cite}`pw98`, computes $h$ via 1091 1092$$ 1093\begin{aligned} 1094 h = \frac{\sqrt{1 + ||u||}e_{rel}}{||a||}\end{aligned} 1095$$ 1096 1097This has no tunable parameters, but note that inside the nonlinear solve 1098for the entire *linear* iterative process $u$ does not change 1099hence $\sqrt{1 + ||u||}$ need be computed only once. This 1100information may be set with the options 1101 1102``` 1103MatMFFDWPSetComputeNormU(Mat, PetscBool); 1104``` 1105 1106or `-mat_mffd_compute_normu <true or false>`. This information is used 1107to eliminate the redundant computation of these parameters, therefore 1108reducing the number of collective operations and improving the 1109efficiency of the application code. This takes place automatically for the PETSc GMRES solver with left preconditioning. 1110 1111It is also possible to monitor the differencing parameters h that are 1112computed via the routines 1113 1114``` 1115MatMFFDSetHHistory(Mat, PetscScalar *, int); 1116MatMFFDResetHHistory(Mat, PetscScalar *, int); 1117MatMFFDGetH(Mat, PetscScalar *); 1118``` 1119 1120We include an explicit example of using matrix-free methods in {any}`ex3.c <snes_ex3>`. 1121Note that by using the option `-snes_mf` one can 1122easily convert any `SNES` code to use a matrix-free Newton-Krylov 1123method without a preconditioner. As shown in this example, 1124`SNESSetFromOptions()` must be called *after* `SNESSetJacobian()` to 1125enable runtime switching between the user-specified Jacobian and the 1126default `SNES` matrix-free form. 1127 1128(snes_ex3)= 1129 1130:::{admonition} Listing: `src/snes/tutorials/ex3.c` 1131```{literalinclude} /../src/snes/tutorials/ex3.c 1132:end-before: /*TEST 1133``` 1134::: 1135 1136Table {any}`tab-jacobians` summarizes the various matrix situations 1137that `SNES` supports. In particular, different linear system matrices 1138and preconditioning matrices are allowed, as well as both matrix-free 1139and application-provided preconditioners. If {any}`ex3.c <snes_ex3>` is run with 1140the options `-snes_mf` and `-user_precond` then it uses a 1141matrix-free application of the matrix-vector multiple and a user 1142provided matrix-free Jacobian. 1143 1144```{eval-rst} 1145.. list-table:: Jacobian Options 1146 :name: tab-jacobians 1147 1148 * - Matrix Use 1149 - Conventional Matrix Formats 1150 - Matrix-free versions 1151 * - Jacobian Matrix 1152 - Create matrix with ``MatCreate()``:math:`^*`. Assemble matrix with user-defined routine :math:`^\dagger` 1153 - Create matrix with ``MatCreateShell()``. Use ``MatShellSetOperation()`` to set various matrix actions, or use ``MatCreateMFFD()`` or ``MatCreateSNESMF()``. 1154 * - Matrix used to construct the preconditioner 1155 - Create matrix with ``MatCreate()``:math:`^*`. Assemble matrix with user-defined routine :math:`^\dagger` 1156 - Use ``SNESGetKSP()`` and ``KSPGetPC()`` to access the ``PC``, then use ``PCSetType(pc, PCSHELL)`` followed by ``PCShellSetApply()``. 1157``` 1158 1159$^*$ Use either the generic `MatCreate()` or a format-specific variant such as `MatCreateAIJ()`. 1160 1161$^\dagger$ Set user-defined matrix formation routine with `SNESSetJacobian()` or with a `DM` variant such as `DMDASNESSetJacobianLocal()` 1162 1163SNES also provides some less well-integrated code to apply matrix-free finite differencing using an automatically computed measurement of the 1164noise of the functions. This can be selected with `-snes_mf_version 2`; it does not use `MatCreateMFFD()` but has similar options that start with 1165`-snes_mf_` instead of `-mat_mffd_`. Note that this alternative prefix **only** works for version 2 differencing. 1166 1167(sec_fdmatrix)= 1168 1169## Finite Difference Jacobian Approximations 1170 1171PETSc provides some tools to help approximate the Jacobian matrices 1172efficiently via finite differences. These tools are intended for use in 1173certain situations where one is unable to compute Jacobian matrices 1174analytically, and matrix-free methods do not work well without a 1175preconditioner, due to very poor conditioning. The approximation 1176requires several steps: 1177 1178- First, one colors the columns of the (not yet built) Jacobian matrix, 1179 so that columns of the same color do not share any common rows. 1180- Next, one creates a `MatFDColoring` data structure that will be 1181 used later in actually computing the Jacobian. 1182- Finally, one tells the nonlinear solvers of `SNES` to use the 1183 `SNESComputeJacobianDefaultColor()` routine to compute the 1184 Jacobians. 1185 1186A code fragment that demonstrates this process is given below. 1187 1188``` 1189ISColoring iscoloring; 1190MatFDColoring fdcoloring; 1191MatColoring coloring; 1192 1193/* 1194 This initializes the nonzero structure of the Jacobian. This is artificial 1195 because clearly if we had a routine to compute the Jacobian we wouldn't 1196 need to use finite differences. 1197*/ 1198FormJacobian(snes, x, &J, &J, &user); 1199 1200/* 1201 Color the matrix, i.e. determine groups of columns that share no common 1202 rows. These columns in the Jacobian can all be computed simultaneously. 1203*/ 1204MatColoringCreate(J, &coloring); 1205MatColoringSetType(coloring, MATCOLORINGSL); 1206MatColoringSetFromOptions(coloring); 1207MatColoringApply(coloring, &iscoloring); 1208MatColoringDestroy(&coloring); 1209/* 1210 Create the data structure that SNESComputeJacobianDefaultColor() uses 1211 to compute the actual Jacobians via finite differences. 1212*/ 1213MatFDColoringCreate(J, iscoloring, &fdcoloring); 1214ISColoringDestroy(&iscoloring); 1215MatFDColoringSetFunction(fdcoloring, (MatFDColoringFn *)FormFunction, &user); 1216MatFDColoringSetFromOptions(fdcoloring); 1217 1218/* 1219 Tell SNES to use the routine SNESComputeJacobianDefaultColor() 1220 to compute Jacobians. 1221*/ 1222SNESSetJacobian(snes, J, J, SNESComputeJacobianDefaultColor, fdcoloring); 1223``` 1224 1225Of course, we are cheating a bit. If we do not have an analytic formula 1226for computing the Jacobian, then how do we know what its nonzero 1227structure is so that it may be colored? Determining the structure is 1228problem dependent, but fortunately, for most structured grid problems 1229(the class of problems for which PETSc was originally designed) if one 1230knows the stencil used for the nonlinear function one can usually fairly 1231easily obtain an estimate of the location of nonzeros in the matrix. 1232This is harder in the unstructured case, but one typically knows where the nonzero entries are from the mesh topology and distribution of degrees of freedom. 1233If using `DMPlex` ({any}`ch_unstructured`) for unstructured meshes, the nonzero locations will be identified in `DMCreateMatrix()` and the procedure above can be used. 1234Most external packages for unstructured meshes have similar functionality. 1235 1236One need not necessarily use a `MatColoring` object to determine a 1237coloring. For example, if a grid can be colored directly (without using 1238the associated matrix), then that coloring can be provided to 1239`MatFDColoringCreate()`. Note that the user must always preset the 1240nonzero structure in the matrix regardless of which coloring routine is 1241used. 1242 1243PETSc provides the following coloring algorithms, which can be selected using `MatColoringSetType()` or via the command line argument `-mat_coloring_type`. 1244 1245```{eval-rst} 1246.. list-table:: 1247 :header-rows: 1 1248 1249 * - Algorithm 1250 - ``MatColoringType`` 1251 - ``-mat_coloring_type`` 1252 - Parallel 1253 * - smallest-last :cite:`more84` 1254 - ``MATCOLORINGSL`` 1255 - ``sl`` 1256 - No 1257 * - largest-first :cite:`more84` 1258 - ``MATCOLORINGLF`` 1259 - ``lf`` 1260 - No 1261 * - incidence-degree :cite:`more84` 1262 - ``MATCOLORINGID`` 1263 - ``id`` 1264 - No 1265 * - Jones-Plassmann :cite:`jp:pcolor` 1266 - ``MATCOLORINGJP`` 1267 - ``jp`` 1268 - Yes 1269 * - Greedy 1270 - ``MATCOLORINGGREEDY`` 1271 - ``greedy`` 1272 - Yes 1273 * - Natural (1 color per column) 1274 - ``MATCOLORINGNATURAL`` 1275 - ``natural`` 1276 - Yes 1277 * - Power (:math:`A^k` followed by 1-coloring) 1278 - ``MATCOLORINGPOWER`` 1279 - ``power`` 1280 - Yes 1281``` 1282 1283As for the matrix-free computation of Jacobians ({any}`sec_nlmatrixfree`), two parameters affect the accuracy of the 1284finite difference Jacobian approximation. These are set with the command 1285 1286``` 1287MatFDColoringSetParameters(MatFDColoring fdcoloring, PetscReal rerror, PetscReal umin); 1288``` 1289 1290The parameter `rerror` is the square root of the relative error in the 1291function evaluations, $e_{rel}$; the default is the square root of 1292machine epsilon (about $10^{-8}$ in double precision), which 1293assumes that the functions are evaluated approximately to floating-point 1294precision accuracy. The second parameter, `umin`, is a bit more 1295involved; its default is $10^{-6}$. Column $i$ of the 1296Jacobian matrix (denoted by $F_{:i}$) is approximated by the 1297formula 1298 1299$$ 1300F'_{:i} \approx \frac{F(u + h*dx_{i}) - F(u)}{h} 1301$$ 1302 1303where $h$ is computed via: 1304 1305$$ 1306h = e_{\text{rel}} \cdot \begin{cases} 1307u_{i} & \text{if $|u_{i}| > u_{\min}$} \\ 1308u_{\min} \cdot \operatorname{sign}(u_{i}) & \text{otherwise}. 1309\end{cases} 1310$$ 1311 1312for `MATMFFD_DS` or: 1313 1314$$ 1315h = e_{\text{rel}} \sqrt{\|u\|} 1316$$ 1317 1318for `MATMFFD_WP` (default). These parameters may be set from the options 1319database with 1320 1321``` 1322-mat_fd_coloring_err <err> 1323-mat_fd_coloring_umin <umin> 1324-mat_fd_type <htype> 1325``` 1326 1327Note that `MatColoring` type `MATCOLORINGSL`, `MATCOLORINGLF`, and 1328`MATCOLORINGID` are sequential algorithms. `MATCOLORINGJP` and 1329`MATCOLORINGGREEDY` are parallel algorithms, although in practice they 1330may create more colors than the sequential algorithms. If one computes 1331the coloring `iscoloring` reasonably with a parallel algorithm or by 1332knowledge of the discretization, the routine `MatFDColoringCreate()` 1333is scalable. An example of this for 2D distributed arrays is given below 1334that uses the utility routine `DMCreateColoring()`. 1335 1336``` 1337DMCreateColoring(dm, IS_COLORING_GHOSTED, &iscoloring); 1338MatFDColoringCreate(J, iscoloring, &fdcoloring); 1339MatFDColoringSetFromOptions(fdcoloring); 1340ISColoringDestroy(&iscoloring); 1341``` 1342 1343Note that the routine `MatFDColoringCreate()` currently is only 1344supported for the AIJ and BAIJ matrix formats. 1345 1346(sec_vi)= 1347 1348## Variational Inequalities 1349 1350`SNES` can also solve (differential) variational inequalities with box (bound) constraints. 1351These are nonlinear algebraic systems with additional inequality 1352constraints on some or all of the variables: 1353$L_i \le u_i \le H_i$. For example, the pressure variable cannot be negative. 1354Some, or all, of the lower bounds may be 1355negative infinity (indicated to PETSc with `SNES_VI_NINF`) and some, or 1356all, of the upper bounds may be infinity (indicated by `SNES_VI_INF`). 1357The commands 1358 1359``` 1360SNESVISetVariableBounds(SNES snes, Vec L, Vec H); 1361SNESVISetComputeVariableBounds(SNES snes, PetscErrorCode (*compute)(SNES, Vec, Vec)) 1362``` 1363 1364are used to indicate that one is solving a variational inequality. Problems with box constraints can be solved with 1365the reduced space, `SNESVINEWTONRSLS`, and semi-smooth `SNESVINEWTONSSLS` solvers. 1366 1367The 1368option `-snes_vi_monitor` turns on extra monitoring of the active set 1369associated with the bounds and `-snes_vi_type` allows selecting from 1370several VI solvers, the default is preferred. 1371 1372`SNESLineSearchSetPreCheck()` and `SNESLineSearchSetPostCheck()` can also be used to control properties 1373of the steps selected by `SNES`. 1374 1375(sec_snespc)= 1376 1377## Nonlinear Preconditioning 1378 1379The mathematical framework of nonlinear preconditioning is explained in detail in {cite}`bruneknepleysmithtu15`. 1380Nonlinear preconditioning in PETSc involves the use of an inner `SNES` 1381instance to define the step for an outer `SNES` instance. The inner 1382instance may be extracted using 1383 1384``` 1385SNESGetNPC(SNES snes, SNES *npc); 1386``` 1387 1388and passed run-time options using the `-npc_` prefix. Nonlinear 1389preconditioning comes in two flavors: left and right. The side may be 1390changed using `-snes_npc_side` or `SNESSetNPCSide()`. Left nonlinear 1391preconditioning redefines the nonlinear function as the action of the 1392nonlinear preconditioner $\mathbf{M}$; 1393 1394$$ 1395\mathbf{F}_{M}(x) = \mathbf{M}(\mathbf{x},\mathbf{b}) - \mathbf{x}. 1396$$ 1397 1398Right nonlinear preconditioning redefines the nonlinear function as the 1399function on the action of the nonlinear preconditioner; 1400 1401$$ 1402\mathbf{F}(\mathbf{M}(\mathbf{x},\mathbf{b})) = \mathbf{b}, 1403$$ 1404 1405which can be interpreted as putting the preconditioner into “striking 1406distance” of the solution by outer acceleration. 1407 1408In addition, basic patterns of solver composition are available with the 1409`SNESType` `SNESCOMPOSITE`. This allows for two or more `SNES` 1410instances to be combined additively or multiplicatively. By command 1411line, a set of `SNES` types may be given by comma separated list 1412argument to `-snes_composite_sneses`. There are additive 1413(`SNES_COMPOSITE_ADDITIVE`), additive with optimal damping 1414(`SNES_COMPOSITE_ADDITIVEOPTIMAL`), and multiplicative 1415(`SNES_COMPOSITE_MULTIPLICATIVE`) variants which may be set with 1416 1417``` 1418SNESCompositeSetType(SNES, SNESCompositeType); 1419``` 1420 1421New subsolvers may be added to the composite solver with 1422 1423``` 1424SNESCompositeAddSNES(SNES, SNESType); 1425``` 1426 1427and accessed with 1428 1429``` 1430SNESCompositeGetSNES(SNES, PetscInt, SNES *); 1431``` 1432 1433```{eval-rst} 1434.. bibliography:: /petsc.bib 1435 :filter: docname in docnames 1436``` 1437