1--- 2orphan: true 3--- 4 5(tut_stokes)= 6 7# Guide to the Stokes Equations using Finite Elements 8 9The Stokes equations 10 11$$ 12\begin{aligned} 13-\nabla \cdot \left(\mu \left(\nabla u + \nabla u^T \right)\right) + \nabla p + f &= 0 \\ 14\nabla\cdot u &= 0 \end{aligned} 15$$ 16 17describe slow flow of an incompressible fluid with velocity $u$, pressure $p$, and body force $f$. 18 19This guide accompanies <a href="PETSC_DOC_OUT_ROOT_PLACEHOLDER/src/snes/tutorials/ex62.c.html">SNES Example 62></a> and <a href="PETSC_DOC_OUT_ROOT_PLACEHOLDER/src/snes/tutorials/ex69.c.html">SNES Example 69><\a>. 20 21The Stokes equations for a fluid, a steady-state form of the Navier-Stokes equations, start with the balance of momentum, just as in elastostatics, 22 23$$ 24\nabla \cdot \sigma + f = 0, 25$$ 26 27where $\sigma$ is the stress tensor and $f$ is the body force, combined with the conservation of mass 28 29$$ 30\nabla \cdot (\rho u) = 0, 31$$ 32 33where $\rho$ is the density and $u$ is the fluid velocity. If we assume that the density is constant, making the fluid incompressible, and that the rheology is Newtonian, meaning that the viscous stress is linearly proportional to the local strain rate, then we have 34 35$$ 36\begin{aligned} 37 \nabla \cdot \mu \left( \nabla u + \nabla u^T \right) - \nabla p + f &= 0 \\ 38 \nabla \cdot u &= 0 39\end{aligned} 40$$ 41 42where $p$ is the pressure, $\mu$ is the dynamic shear viscosity, with units $N\cdot s/m^2$ or $Pa\cdot s$. If we divide by the constant density, we would have the kinematic viscosity $\nu$ and a force per unit mass. The second equation demands that the velocity field be divergence-free, indicating that the flow is incompressible. The pressure in this case can be thought of as the Lagrange multiplier enforcing the incompressibility constraint. In the compressible case, we would need an equation of state to relate the pressure to the density, and perhaps temperature. 43 44We will discretize our Stokes equations with finite elements, so the first step is to write a variational weak form of the equations. We choose to use a Ritz-Galerkin setup, so let our velocity $u \in V$ and pressure $p \in Q$, so that 45 46$$ 47\begin{aligned} 48 \left< \nabla v, \mu \left( \nabla u + \nabla u^T \right) \right> + \left< v, \frac{\partial\sigma}{\partial n} \right>_\Gamma - \left< \nabla\cdot v, p \right> - \left< v, f \right> &= 0 & \text{for all} \ v \in V\\ 49 \left< q, -\nabla \cdot u \right> &= 0 & \text{for all} \ q \in Q 50\end{aligned} 51$$ 52 53where integration by parts has added a boundary integral over the normal derivative of the stress (traction), and natural boundary conditions correspond to stress-free boundaries. We have multiplied the continuity equation by minus one in order to preserve symmetry. 54 55## Equation Definition 56 57The test functions $v, q$ and their derivatives are determined by the discretization, whereas the form of the integrand is determined by the physics. Given a quadrature rule to evaluate the form integral, we would only need the evaluation of the physics integrand at the quadrature points, given the values of the fields and their derivatives. The entire scheme is detailed in {cite}`knepleybrownruppsmith13`. The kernels paired with test functions we will call $f_0$ and those paired with gradients of test functions will be called $f_1$. 58 59For example, the kernel for the continuity equation, paired with the pressure test function, is called `f0_p` and can be seen here 60 61```{literalinclude} /../src/snes/tutorials/ex62.c 62:end-at: '}' 63:start-at: static void f0_p( 64``` 65 66We use the components of the Jacobian of $u$ to build up its divergence. For the balance of momentum excluding body force, we test against the gradient of the test function, as seen in `f1_u`, 67 68```{literalinclude} /../src/snes/tutorials/ex62.c 69:end-at: '}' 70:start-at: static void f1_u( 71``` 72 73Notice how the pressure $p$ is referred to using `u[uOff[1]]` so that we can have many fields with different numbers of components. `DMPlex` uses these point functions to construct the residual. A similar set of point functions is also used to build the Jacobian. The last piece of our physics specification is the construction of exact solutions using the Method of Manufactured Solutions (MMS). 74 75## MMS Solutions 76 77An MMS solution is chosen to elucidate some property of the problem, and to check that it is being solved accurately, since the error can be calculated explicitly. For our Stokes problem, we first choose a solution with quadratic velocity and linear pressure, 78 79$$ 80u = \begin{pmatrix} x^2 + y^2 \\ 2 x^2 - 2 x y \end{pmatrix} \quad \mathrm{or} \quad \begin{pmatrix} 2 x^2 + y^2 + z^2 \\ 2 x^2 - 2xy \\ 2 x^2 - 2xz \end{pmatrix} 81$$ 82 83```{literalinclude} /../src/snes/tutorials/ex62.c 84:append: '}' 85:end-at: return 0; 86:start-at: static PetscErrorCode quadratic_u 87``` 88 89$$ 90p = x + y - 1 \quad \mathrm{or} \quad x + y + z - \frac{3}{2} 91$$ 92 93```{literalinclude} /../src/snes/tutorials/ex62.c 94:append: '}' 95:end-at: return 0; 96:start-at: static PetscErrorCode quadratic_p 97``` 98 99By plugging these solutions into our equations, assuming that the velocity we choose is divergence-free, we can determine the body force necessary to make them satisfy the Stokes equations. For the quadratic solution above, we find 100 101$$ 102f = \begin{pmatrix} 1 - 4\mu \\ 1 - 4\mu \end{pmatrix} \quad \mathrm{or} \quad \begin{pmatrix} 1 - 8\mu \\ 1 - 4\mu \\ 1 - 4\mu \end{pmatrix} 103$$ 104 105which is implemented in our `f0_quadratic_u` pointwise function 106 107```{literalinclude} /../src/snes/tutorials/ex62.c 108:end-at: '}' 109:start-at: static void f0_quadratic_u 110``` 111 112We let PETSc know about these solutions 113 114```{literalinclude} /../src/snes/tutorials/ex62.c 115:end-at: PetscCall(PetscDSSetExactSolution(ds, 1 116:start-at: PetscCall(PetscDSSetExactSolution(ds, 0 117``` 118 119These solutions will be captured exactly by the $P_2-P_1$ finite element space. We can use the `-dmsnes_check` option to activate function space checks. It gives the $L_2$ error, or *discretization* error, of the exact solution, the residual computed using the interpolation of the exact solution into our finite element space, and uses a Taylor test to check that our Jacobian matches the residual. It should converge at order 2, or be exact in the case of linear equations like Stokes. Our $P_2-P_1$ runs in the PETSc test section at the bottom of the source file 120 121```{literalinclude} /../src/snes/tutorials/ex62.c 122:lines: 1-3 123:start-at: 'suffix: 2d_p2_p1_check' 124``` 125 126```{literalinclude} /../src/snes/tutorials/ex62.c 127:lines: 1-3 128:start-at: 'suffix: 3d_p2_p1_check' 129``` 130 131verify these claims, as we can see from the output files 132 133```{literalinclude} /../src/snes/tutorials/output/ex62_2d_p2_p1_check.out 134:language: none 135``` 136 137```{literalinclude} /../src/snes/tutorials/output/ex62_3d_p2_p1_check.out 138:language: none 139``` 140 141We can carry out the same tests for the $Q_2-Q_1$ element, 142 143```{literalinclude} /../src/snes/tutorials/ex62.c 144:lines: 1-2 145:start-at: 'suffix: 2d_q2_q1_check' 146``` 147 148```{literalinclude} /../src/snes/tutorials/ex62.c 149:lines: 1-2 150:start-at: 'suffix: 3d_q2_q1_check' 151``` 152 153The quadratic solution, however, cannot tell us whether our discretization is attaining the correct order of convergence, especially for higher order elements. Thus, we will define another solution based on trigonometric functions. 154 155$$ 156u = \begin{pmatrix} \sin(\pi x) + \sin(\pi y) \\ -\pi \cos(\pi x) y \end{pmatrix} \quad \mathrm{or} \quad 157 \begin{pmatrix} 2 \sin(\pi x) + \sin(\pi y) + \sin(\pi z) \\ -\pi \cos(\pi x) y \\ -\pi \cos(\pi x) z \end{pmatrix} 158$$ 159 160```{literalinclude} /../src/snes/tutorials/ex62.c 161:append: '}' 162:end-at: return 0; 163:start-at: static PetscErrorCode trig_u 164``` 165 166$$ 167p = \sin(2 \pi x) + \sin(2 \pi y) \quad \mathrm{or} \quad \sin(2 \pi x) + \sin(2 \pi y) + \sin(2 \pi z) 168$$ 169 170```{literalinclude} /../src/snes/tutorials/ex62.c 171:append: '}' 172:end-at: return 0; 173:start-at: static PetscErrorCode trig_p 174``` 175 176$$ 177f = \begin{pmatrix} 2 \pi \cos(2 \pi x) + \mu \pi^2 \sin(\pi x) + \mu \pi^2 \sin(\pi y) \\ 2 \pi \cos(2 \pi y) - \mu \pi^3 \cos(\pi x) y \end{pmatrix} \quad \mathrm{or} \quad 178\begin{pmatrix} 2 \pi \cos(2 \pi x) + 2\mu \pi^2 \sin(\pi x) + \mu \pi^2 \sin(\pi y) + \mu \pi^2 \sin(\pi z) \\ 2 \pi \cos(2 \pi y) - \mu \pi^3 cos(\pi x) y \\ 2 \pi \cos(2 \pi z) - \mu \pi^3 \cos(\pi x) z \end{pmatrix} 179$$ 180 181```{literalinclude} /../src/snes/tutorials/ex62.c 182:append: '}' 183:end-at: '}' 184:start-at: static void f0_quadratic_u 185``` 186 187We can now use `-snes_convergence_estimate` to determine the convergence exponent for the discretization. This options solves the problem on a series of refined meshes, calculates the error on each mesh, and determines the slope on a logarithmic scale. For example, we do this in two dimensions, refining our mesh twice using `-convest_num_refine 2` in the following test. 188 189```{literalinclude} /../src/snes/tutorials/ex62.c 190:end-before: 'test:' 191:start-at: 'suffix: 2d_p2_p1_conv' 192``` 193 194However, the test needs an accurate linear solver. Sparse LU factorizations do not, in general, do full pivoting. Thus we must deal with the zero pressure block explicitly. We use the `PCFIELDSPLIT` preconditioner and the full Schur complement factorization, but we still need a preconditioner for the Schur complement $B^T A^{-1} B$. We can have PETSc construct that matrix automatically, but the cost rises steeply as the problem size increases. Instead, we use the fact that the Schur complement is spectrally equivalent to the pressure mass matrix $M_p$. We can make a matrix from which the preconditioner is constructed, which has the diagonal blocks we will use to build the preconditioners, letting PETSc know that we get the off-diagonal blocks from the original system with `-pc_fieldsplit_off_diag_use_amat` and to build the Schur complement from the original matrix using `-pc_use_amat`, 195 196```{literalinclude} /../src/snes/tutorials/ex62.c 197:end-at: PetscCall(PetscDSSetJacobianPreconditioner(ds, 1 198:start-at: PetscCall(PetscDSSetJacobianPreconditioner(ds, 0 199``` 200 201Putting this all together, and using exact solvers on the subblocks, we have 202 203```{literalinclude} /../src/snes/tutorials/ex62.c 204:end-before: 'test:' 205:start-at: 'suffix: 2d_p2_p1_conv' 206``` 207 208and we see it converges, however it is superconverging in the pressure, 209 210```{literalinclude} /../src/snes/tutorials/output/ex62_2d_p2_p1_conv.out 211``` 212 213If we refine the mesh using `-dm_refine 3`, the convergence rates become `[3.0, 2.1]`. 214 215## Dealing with Parameters 216 217Like most physical problems, the Stokes problem has a parameter, the dynamic shear viscosity, which determines what solution regime we are in. To handle these parameters in PETSc, we first define a C struct to hold them, 218 219```{literalinclude} /../src/snes/tutorials/ex62.c 220:end-at: '} Parameter;' 221:start-at: typedef struct 222``` 223 224and then add a `PetscBag` object to our application context. We then setup the parameter object, 225 226```{literalinclude} /../src/snes/tutorials/ex62.c 227:append: '}' 228:end-at: PetscFunctionReturn(PETSC_SUCCESS); 229:start-at: static PetscErrorCode SetupParameters 230``` 231 232which will allow us to set the value from the command line using `-mu`. The `PetscBag` can also be persisted to disk with `PetscBagLoad/View()`. We can make these values available as constant to our pointwise functions through the `PetscDS` object. 233 234```{literalinclude} /../src/snes/tutorials/ex62.c 235:end-at: '}' 236:start-after: /* Make constant values 237``` 238 239## Investigating convergence 240 241In order to look at the convergence of some harder problems, we will examine `SNES ex69`. This example provides an exact solution to the variable viscosity Stokes equation. The sharp viscosity variation will allow us to investigate convergence of the solver and discretization. Briefly, a sharp viscosity variation is created across the unit square, imposed on a background pressure with given fundamental frequency. For example, we can create examples with period one half and viscosity $e^{2 B x}$ (solKx) 242 243```console 244$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-dm_refine 5 -dm_view hdf5:$PETSC_DIR/sol.h5 -snes_view_solution hdf5:$PETSC_DIR/sol.h5::append -exact_vec_view hdf5:$PETSC_DIR/sol.h5::append -m 2 -n 2 -B 1" 245$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-dm_refine 5 -dm_view hdf5:$PETSC_DIR/sol.h5 -snes_view_solution hdf5:$PETSC_DIR/sol.h5::append -exact_vec_view hdf5:$PETSC_DIR/sol.h5::append -m 2 -n 2 -B 3.75" 246``` 247 248which are show in the figure below. 249 250```{eval-rst} 251.. list-table:: 252 253 * - .. figure:: /images/tutorials/physics/ex69_sol_m_2_n_2_B_1.png 254 255 Solution for :math:`m=2`, :math:`n=2`, :math:`B=1` 256 257 - .. figure:: /images/tutorials/physics/ex69_sol_m_2_n_2_B_375.png 258 259 Solution for :math:`m=2`, :math:`n=2`, :math:`B=3.75` 260``` 261 262### Debugging 263 264If we can provide the `PetscDS` object in our problem with the exact solution function, PETSc has good support for debugging our discretization and solver. We can use the `PetscConvEst` object to check the convergence behavior of our element automatically. For example, if we use the `-snes_convergence_estimate` option, PETSc will solve our nonlinear equations on a series of refined meshes, use our exact solution to calculate the error, and then fit this line on a log-log scale to get the convergence rate, 265 266```{literalinclude} /../src/snes/tutorials/ex69.c 267:end-before: 'test:' 268:start-at: 'suffix: p2p1_conv' 269``` 270 271If we initially refine the mesh twice, `-dm_refine 2`, we get 272 273> L_2 convergence rate: [3.0, 2.2] 274 275which are the convergence rates we expect for the velocity and pressure using a $P_2-P_1$ discretization. For $Q_1-P_0$ 276 277```{literalinclude} /../src/snes/tutorials/ex69.c 278:end-before: 'test:' 279:start-at: 'suffix: q1p0_conv' 280``` 281 282we get 283 284> L_2 convergence rate: [2.0, 1.0] 285 286This is a sensitive check that everything is working correctly. However, if this is wrong, where can I start? More fine-grained checks are available using the `-dmsnes_check` option. Using this for our $P_2-P_1$ example (the `p2p1` test), we have 287 288```{literalinclude} /../src/snes/tutorials/output/ex69_p2p1.out 289``` 290 291The first line records the discretization error for our exact solution. This means that we project our solution function into the finite element space and then calculate the $L_2$ norm of the difference between the exact solution and its projection. The norm is computed for each field separately. Next, PETSc calculates the residual using the projected exact solution as input. This should be small, and as the mesh is refined it should approach zero. Last, PETSc uses a Taylor test to try and determine how the error in the linear model scales as a function of the perturbation $h$. Thus, in a nonlinear situation we would expect 292 293> Taylor approximation converging at order 2.0 294 295In this case, since the viscosity does not depend on the velocity or pressure fields, we detect that the linear model is exact 296 297> Function appears to be linear 298 299Suppose that we have made an error in the Jacobian. For instance, let us accidentally flip the sign of the pressure term in the momentum Jacobian. 300 301```{literalinclude} /../src/snes/tutorials/ex69.c 302:end-at: '}' 303:start-at: static void stokes_momentum_pres_J 304``` 305 306When we run, we get a failure of the nonlinear solver. Our checking reveals that the Jacobian is wrong because it is converging at order 1 instead of 2, meaning the linear term is not correct in our model. 307 308```console 309$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-snes_monitor -ksp_monitor_true_residual -ksp_converged_reason" 310L_2 Error: [0.000439127, 0.0376629] 311L_2 Residual: 0.0453958 312Taylor approximation converging at order 1.00 313 0 SNES Function norm 1.170604545948e-01 314 0 KSP preconditioned resid norm 4.965098891419e-01 true resid norm 1.170604545948e-01 ||r(i)||/||b|| 1.000000000000e+00 315 1 KSP preconditioned resid norm 9.236805404733e-11 true resid norm 1.460082233654e-12 ||r(i)||/||b|| 1.247289051378e-11 316 Linear solve converged due to CONVERGED_ATOL iterations 1 317[0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- 318[0]PETSC ERROR: 319[0]PETSC ERROR: SNESSolve has not converged 320``` 321 322In order to track down the error, we can use `-snes_test_jacobian` which computes a finite difference approximation to the Jacobian and compares that to the analytic Jacobian. We ignore the first test, which occurs during our testing of the Jacobian, and look at the test that happens during the first Newton iterate. We see that the relative error in the Frobenius norm is about one percent, which indicates we have a real problem. 323 324```console 325$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-snes_monitor -ksp_monitor_true_residual -ksp_converged_reason -snes_test_jacobian" 326L_2 Error: [0.000439127, 0.0376629] 327L_2 Residual: 0.0453958 328 ---------- Testing Jacobian ------------- 329 Run with -snes_test_jacobian_view and optionally -snes_test_jacobian <threshold> to show difference 330 of hand-coded and finite difference Jacobian entries greater than <threshold>. 331 Testing hand-coded Jacobian, if (for double precision runs) ||J - Jfd||_F/||J||_F is 332 O(1.e-8), the hand-coded Jacobian is probably correct. 333 ||J - Jfd||_F/||J||_F = 136.793, ||J - Jfd||_F = 136.793 334 ---------- Testing Jacobian for preconditioner ------------- 335 ||J - Jfd||_F/||J||_F = 136.793, ||J - Jfd||_F = 136.793 336Taylor approximation converging at order 1.00 337 0 SNES Function norm 1.170604545948e-01 338 ---------- Testing Jacobian ------------- 339 ||J - Jfd||_F/||J||_F = 0.0119377, ||J - Jfd||_F = 1.63299 340 ---------- Testing Jacobian for preconditioner ------------- 341 ||J - Jfd||_F/||J||_F = 0.008471, ||J - Jfd||_F = 1.15873 342 0 KSP preconditioned resid norm 4.965098891419e-01 true resid norm 1.170604545948e-01 ||r(i)||/||b|| 1.000000000000e+00 343 1 KSP preconditioned resid norm 9.236804064319e-11 true resid norm 1.460031196842e-12 ||r(i)||/||b|| 1.247245452699e-11 344 Linear solve converged due to CONVERGED_ATOL iterations 1 345[0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- 346[0]PETSC ERROR: 347[0]PETSC ERROR: SNESSolve has not converged 348``` 349 350At this point, we could just go back and check the code. However, PETSc will also print out the differences between the analytic and approximate Jacobians. When we give the `-snes_test_jacobian_view` option, the code will print both Jacobians (which we omit) and then their difference, and will also do this for the matrix used to construct the preconditioner (which we omit). It is clear from the output that the $u-p$ block of the Jacobian is wrong, and thus we know right where to look for our error. Moreover, if we look at the values in row 15, we see that the values just differ by a sign. 351 352```console 353$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-snes_monitor -ksp_monitor_true_residual -ksp_converged_reason -snes_test_jacobian" 354 Hand-coded minus finite-difference Jacobian with tolerance 1e-05 ---------- 355Mat Object: 1 MPI process 356 type: seqaij 357row 0: 358row 1: 359row 2: 360row 3: 361row 4: 362row 5: 363row 6: 364row 7: 365row 8: 366row 9: 367row 10: 368row 11: 369row 12: 370row 13: 371row 14: 372row 15: (0, 0.166667) (2, -0.166667) 373row 16: (0, 0.166667) (2, -0.166667) (5, 0.166667) (8, -0.166667) 374row 17: (0, 0.166667) (2, 0.166667) (5, -0.166667) (8, -0.166667) 375row 18: (0, 0.166667) (5, -0.166667) 376row 19: (5, 0.166667) (8, -0.166667) (11, 0.166667) (13, -0.166667) 377row 20: (5, 0.166667) (8, 0.166667) (11, -0.166667) (13, -0.166667) 378row 21: (5, 0.166667) (11, -0.166667) 379row 22: (5, 0.333333) (8, -0.333333) 380row 23: (2, 0.166667) (5, 0.166667) (8, -0.166667) (11, -0.166667) 381row 24: (2, 0.166667) (3, -0.166667) (5, 0.166667) (8, -0.166667) 382row 25: (2, 0.333333) (8, -0.333333) 383row 26: (2, 0.166667) (3, -0.166667) (8, 0.166667) (10, -0.166667) 384row 27: (2, 0.166667) (3, 0.166667) (8, -0.166667) (10, -0.166667) 385row 28: (3, 0.166667) (10, -0.166667) 386row 29: (8, 0.333333) (10, -0.333333) 387row 30: (3, 0.166667) (8, 0.166667) (10, -0.166667) (13, -0.166667) 388row 31: (2, 0.166667) (3, -0.166667) 389row 32: (8, 0.166667) (10, -0.166667) (13, 0.166667) (14, -0.166667) 390row 33: (8, 0.166667) (10, 0.166667) (13, -0.166667) (14, -0.166667) 391row 34: (10, 0.166667) (14, -0.166667) 392row 35: (13, 0.166667) (14, -0.166667) 393row 36: (8, 0.166667) (10, -0.166667) (11, 0.166667) (13, -0.166667) 394row 37: (8, 0.333333) (13, -0.333333) 395row 38: (11, 0.166667) (13, -0.166667) 396 0 KSP preconditioned resid norm 4.965098891419e-01 true resid norm 1.170604545948e-01 ||r(i)||/||b|| 1.000000000000e+00 397 1 KSP preconditioned resid norm 9.236804067326e-11 true resid norm 1.460031196842e-12 ||r(i)||/||b|| 1.247245452699e-11 398 Linear solve converged due to CONVERGED_ATOL iterations 1 399 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- 400 [0]PETSC ERROR: 401 [0]PETSC ERROR: SNESSolve has not converged 402``` 403 404Can we see that the Schur complement of Q1-P0 is ill-conditioned? 405 406### Optimizing the Solver 407 408In order to see exactly what solver we have employed, we can use the `-snes_view` option. When checking $P_2-P_1$ convergence, we use an exact solver, but it must have several parts in order to deal with the saddle-point in the Jacobian. Using the test system to provide our extra option, we get 409 410```console 411$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-snes_view" 412SNES Object: 1 MPI process 413 type: newtonls 414 maximum iterations=50, maximum function evaluations=10000 415 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 416 total number of linear solver iterations=1 417 total number of function evaluations=2 418 norm schedule ALWAYS 419 SNESLineSearch Object: 1 MPI process 420 type: bt 421 interpolation: cubic 422 alpha=1.000000e-04 423 maxstep=1.000000e+08, minlambda=1.000000e-12 424 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 425 maximum iterations=40 426 KSP Object: 1 MPI process 427 type: gmres 428 restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement 429 happy breakdown tolerance 1e-30 430 maximum iterations=10000, initial guess is zero 431 tolerances: relative=1e-09, absolute=1e-10, divergence=10000. 432 left preconditioning 433 using PRECONDITIONED norm type for convergence test 434 PC Object: 1 MPI process 435 type: fieldsplit 436 FieldSplit with Schur preconditioner, factorization FULL 437 Preconditioner for the Schur complement formed from A11 438 Split info: 439 Split number 0 Defined by IS 440 Split number 1 Defined by IS 441 KSP solver for A00 block 442 KSP Object: (fieldsplit_velocity_) 1 MPI process 443 type: gmres 444 restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement 445 happy breakdown tolerance 1e-30 446 maximum iterations=10000, initial guess is zero 447 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 448 left preconditioning 449 using PRECONDITIONED norm type for convergence test 450 PC Object: (fieldsplit_velocity_) 1 MPI process 451 type: lu 452 out-of-place factorization 453 tolerance for zero pivot 2.22045e-14 454 matrix ordering: nd 455 factor fill ratio given 5., needed 1.15761 456 Factored matrix follows: 457 Mat Object: 1 MPI process 458 type: seqaij 459 rows=30, cols=30 460 package used to perform factorization: petsc 461 total: nonzeros=426, allocated nonzeros=426 462 using I-node routines: found 17 nodes, limit used is 5 463 linear system matrix followed by preconditioner matrix: 464 Mat Object: 1 MPI process 465 type: seqaij 466 rows=30, cols=30 467 total: nonzeros=368, allocated nonzeros=368 468 total number of mallocs used during MatSetValues calls=0 469 using I-node routines: found 20 nodes, limit used is 5 470 Mat Object: (fieldsplit_velocity_) 1 MPI process 471 type: seqaij 472 rows=30, cols=30 473 total: nonzeros=368, allocated nonzeros=368 474 total number of mallocs used during MatSetValues calls=0 475 using I-node routines: found 20 nodes, limit used is 5 476 KSP solver for S = A11 - A10 inv(A00) A01 477 KSP Object: (fieldsplit_pressure_) 1 MPI process 478 type: gmres 479 restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement 480 happy breakdown tolerance 1e-30 481 maximum iterations=10000, initial guess is zero 482 tolerances: relative=1e-09, absolute=1e-50, divergence=10000. 483 left preconditioning 484 using PRECONDITIONED norm type for convergence test 485 PC Object: (fieldsplit_pressure_) 1 MPI process 486 type: lu 487 out-of-place factorization 488 tolerance for zero pivot 2.22045e-14 489 matrix ordering: nd 490 factor fill ratio given 5., needed 1.2439 491 Factored matrix follows: 492 Mat Object: 1 MPI process 493 type: seqaij 494 rows=9, cols=9 495 package used to perform factorization: petsc 496 total: nonzeros=51, allocated nonzeros=51 497 not using I-node routines 498 linear system matrix followed by preconditioner matrix: 499 Mat Object: (fieldsplit_pressure_) 1 MPI process 500 type: schurcomplement 501 rows=9, cols=9 502 has attached null space 503 Schur complement A11 - A10 inv(A00) A01 504 A11 505 Mat Object: 1 MPI process 506 type: seqaij 507 rows=9, cols=9 508 total: nonzeros=41, allocated nonzeros=41 509 total number of mallocs used during MatSetValues calls=0 510 has attached null space 511 not using I-node routines 512 A10 513 Mat Object: 1 MPI process 514 type: seqaij 515 rows=9, cols=30 516 total: nonzeros=122, allocated nonzeros=122 517 total number of mallocs used during MatSetValues calls=0 518 not using I-node routines 519 KSP solver for A00 block viewable with the additional option -fc_fieldsplit_velocity_ksp_view 520 A01 521 Mat Object: 1 MPI process 522 type: seqaij 523 rows=30, cols=9 524 total: nonzeros=122, allocated nonzeros=122 525 total number of mallocs used during MatSetValues calls=0 526 using I-node routines: found 20 nodes, limit used is 5 527 Mat Object: (fieldsplit_pressure_) 1 MPI process 528 type: seqaij 529 rows=9, cols=9 530 total: nonzeros=41, allocated nonzeros=41 531 total number of mallocs used during MatSetValues calls=0 532 not using I-node routines 533 linear system matrix followed by preconditioner matrix: 534 Mat Object: 1 MPI process 535 type: seqaij 536 rows=39, cols=39 537 total: nonzeros=653, allocated nonzeros=653 538 total number of mallocs used during MatSetValues calls=0 539 has attached null space 540 using I-node routines: found 24 nodes, limit used is 5 541 Mat Object: (prec_) 1 MPI process 542 type: seqaij 543 rows=39, cols=39 544 total: nonzeros=653, allocated nonzeros=653 545 total number of mallocs used during MatSetValues calls=0 546 using I-node routines: found 24 nodes, limit used is 5 547``` 548 549Going through this piece-by-piece, we can see all the parts of our solver. At the top level, we have a `SNES` using Newton's method 550 551```console 552$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-snes_view" 553SNES Object: 1 MPI process 554 type: newtonls 555 maximum iterations=50, maximum function evaluations=10000 556 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 557 total number of linear solver iterations=1 558 total number of function evaluations=2 559 norm schedule ALWAYS 560 SNESLineSearch Object: 1 MPI process 561 type: bt 562 interpolation: cubic 563 alpha=1.000000e-04 564 maxstep=1.000000e+08, minlambda=1.000000e-12 565 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 566 maximum iterations=40 567``` 568 569For each nonlinear step, we use `KSPGMRES` to solve the Newton equation, preconditioned by `PCFIELDSPLIT`. We split the problem into two blocks, with the split determined by our `DM`, and combine those blocks using a Schur complement. The Schur complement is faithful since we use the `FULL` factorization type. 570 571```console 572$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-snes_view" 573 KSP Object: 1 MPI process 574 type: gmres 575 restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement 576 happy breakdown tolerance 1e-30 577 maximum iterations=10000, initial guess is zero 578 tolerances: relative=1e-09, absolute=1e-10, divergence=10000. 579 left preconditioning 580 using PRECONDITIONED norm type for convergence test 581 PC Object: 1 MPI process 582 type: fieldsplit 583 FieldSplit with Schur preconditioner, factorization FULL 584 Preconditioner for the Schur complement formed from A11 585 Split info: 586 Split number 0 Defined by IS 587 Split number 1 Defined by IS 588``` 589 590We form the preconditioner for the Schur complement from the (1,1) block of our matrix used to construct the preconditioner, which we have set to be the viscosity-weighted mass matrix 591 592```{literalinclude} /../src/snes/tutorials/ex69.c 593:end-before: /* 594:start-at: static void stokes_identity_J_kx 595``` 596 597The solver for the first block, representing the velocity, is GMRES/LU. Note that the prefix is `fieldsplit_velocity_`, constructed automatically from the name of the field in our DM. Also note that there are two matrices, one from our original matrix, and one from our matrix used to construct the preconditioner, but they are identical. In an optimized, scalable solver, this block would likely be solved by multigrid, but here we use LU for verification purposes. 598 599```console 600$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-snes_view" 601 KSP solver for A00 block 602 KSP Object: (fieldsplit_velocity_) 1 MPI process 603 type: gmres 604 restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement 605 happy breakdown tolerance 1e-30 606 maximum iterations=10000, initial guess is zero 607 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 608 left preconditioning 609 using PRECONDITIONED norm type for convergence test 610 PC Object: (fieldsplit_velocity_) 1 MPI process 611 type: lu 612 out-of-place factorization 613 tolerance for zero pivot 2.22045e-14 614 matrix ordering: nd 615 factor fill ratio given 5., needed 1.15761 616 Factored matrix follows: 617 Mat Object: 1 MPI process 618 type: seqaij 619 rows=30, cols=30 620 package used to perform factorization: petsc 621 total: nonzeros=426, allocated nonzeros=426 622 using I-node routines: found 17 nodes, limit used is 5 623 linear system matrix followed by preconditioner matrix: 624 Mat Object: 1 MPI process 625 type: seqaij 626 rows=30, cols=30 627 total: nonzeros=368, allocated nonzeros=368 628 total number of mallocs used during MatSetValues calls=0 629 using I-node routines: found 20 nodes, limit used is 5 630 Mat Object: (fieldsplit_velocity_) 1 MPI process 631 type: seqaij 632 rows=30, cols=30 633 total: nonzeros=368, allocated nonzeros=368 634 total number of mallocs used during MatSetValues calls=0 635 using I-node routines: found 20 nodes, limit used is 5 636``` 637 638The solver for the second block, with prefix `fieldsplit_pressure_`, is also GMRES/LU, however we cannot factor the Schur complement operator since we never explicitly assemble it. Thus we assemble the viscosity-weighted mass matrix on the pressure space as an approximation. Notice that the Schur complement has the velocity solver embedded in it. 639 640```console 641$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-snes_view" 642 KSP solver for S = A11 - A10 inv(A00) A01 643 KSP Object: (fieldsplit_pressure_) 1 MPI process 644 type: gmres 645 restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement 646 happy breakdown tolerance 1e-30 647 maximum iterations=10000, initial guess is zero 648 tolerances: relative=1e-09, absolute=1e-50, divergence=10000. 649 left preconditioning 650 using PRECONDITIONED norm type for convergence test 651 PC Object: (fieldsplit_pressure_) 1 MPI process 652 type: lu 653 out-of-place factorization 654 tolerance for zero pivot 2.22045e-14 655 matrix ordering: nd 656 factor fill ratio given 5., needed 1.2439 657 Factored matrix follows: 658 Mat Object: 1 MPI process 659 type: seqaij 660 rows=9, cols=9 661 package used to perform factorization: petsc 662 total: nonzeros=51, allocated nonzeros=51 663 not using I-node routines 664 linear system matrix followed by preconditioner matrix: 665 Mat Object: (fieldsplit_pressure_) 1 MPI process 666 type: schurcomplement 667 rows=9, cols=9 668 has attached null space 669 Schur complement A11 - A10 inv(A00) A01 670 A11 671 Mat Object: 1 MPI process 672 type: seqaij 673 rows=9, cols=9 674 total: nonzeros=41, allocated nonzeros=41 675 total number of mallocs used during MatSetValues calls=0 676 has attached null space 677 not using I-node routines 678 A10 679 Mat Object: 1 MPI process 680 type: seqaij 681 rows=9, cols=30 682 total: nonzeros=122, allocated nonzeros=122 683 total number of mallocs used during MatSetValues calls=0 684 not using I-node routines 685 KSP of A00 686 KSP Object: (fieldsplit_velocity_) 1 MPI process 687 type: gmres 688 restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement 689 happy breakdown tolerance 1e-30 690 maximum iterations=10000, initial guess is zero 691 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 692 left preconditioning 693 using PRECONDITIONED norm type for convergence test 694 PC Object: (fieldsplit_velocity_) 1 MPI process 695 type: lu 696 out-of-place factorization 697 tolerance for zero pivot 2.22045e-14 698 matrix ordering: nd 699 factor fill ratio given 5., needed 1.15761 700 Factored matrix follows: 701 Mat Object: 1 MPI process 702 type: seqaij 703 rows=30, cols=30 704 package used to perform factorization: petsc 705 total: nonzeros=426, allocated nonzeros=426 706 using I-node routines: found 17 nodes, limit used is 5 707 linear system matrix followed by preconditioner matrix: 708 Mat Object: 1 MPI process 709 type: seqaij 710 rows=30, cols=30 711 total: nonzeros=368, allocated nonzeros=368 712 total number of mallocs used during MatSetValues calls=0 713 using I-node routines: found 20 nodes, limit used is 5 714 Mat Object: (fieldsplit_velocity_) 1 MPI process 715 type: seqaij 716 rows=30, cols=30 717 total: nonzeros=368, allocated nonzeros=368 718 total number of mallocs used during MatSetValues calls=0 719 using I-node routines: found 20 nodes, limit used is 5 720 A01 721 Mat Object: 1 MPI process 722 type: seqaij 723 rows=30, cols=9 724 total: nonzeros=122, allocated nonzeros=122 725 total number of mallocs used during MatSetValues calls=0 726 using I-node routines: found 20 nodes, limit used is 5 727 Mat Object: (fieldsplit_pressure_) 1 MPI process 728 type: seqaij 729 rows=9, cols=9 730 total: nonzeros=41, allocated nonzeros=41 731 total number of mallocs used during MatSetValues calls=0 732 not using I-node routines 733``` 734 735Finally, the SNES viewer reports the system matrix and matrix from which the preconditioner is constructed 736 737```console 738$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-snes_view" 739 linear system matrix followed by preconditioner matrix: 740 Mat Object: 1 MPI process 741 type: seqaij 742 rows=39, cols=39 743 total: nonzeros=653, allocated nonzeros=653 744 total number of mallocs used during MatSetValues calls=0 745 has attached null space 746 using I-node routines: found 24 nodes, limit used is 5 747 Mat Object: (prec_) 1 MPI process 748 type: seqaij 749 rows=39, cols=39 750 total: nonzeros=653, allocated nonzeros=653 751 total number of mallocs used during MatSetValues calls=0 752 using I-node routines: found 24 nodes, limit used is 5 753``` 754 755We see that they have the same nonzero pattern, even though the matrix used to construct the preconditioner only contains the diagonal blocks. This is because zeros were inserted to define the nonzero structure. We can remove these nonzeros by telling the DM not to insert zero at preallocation time, and also telling the matrix itself to ignore the zeros from the assembly process. 756 757```console 758$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-snes_view -dm_preallocate_only -prec_mat_ignore_zero_entries" 759 linear system matrix followed by preconditioner matrix: 760 Mat Object: 1 MPI process 761 type: seqaij 762 rows=39, cols=39 763 total: nonzeros=653, allocated nonzeros=653 764 total number of mallocs used during MatSetValues calls=0 765 has attached null space 766 using I-node routines: found 24 nodes, limit used is 5 767 Mat Object: (prec_) 1 MPI process 768 type: seqaij 769 rows=39, cols=39 770 total: nonzeros=409, allocated nonzeros=653 771 total number of mallocs used during MatSetValues calls=0 772 using I-node routines: found 29 nodes, limit used is 5 773``` 774 775We can see a sparsity portrait of the system and preconditioning matrices if the installation supports X-windows visualization 776 777```console 778$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-ksp_view_mat draw -prec_mat_view draw -draw_pause -1" 779$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-ksp_view_mat draw -prec_mat_view draw -draw_save $PETSC_DIR/mat.png" 780$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-dm_preallocate_only -mat_ignore_zero_entries -prec_mat_ignore_zero_entries -ksp_view_mat draw -prec_mat_view draw -draw_save $PETSC_DIR/mat_sparse.png" 781``` 782 783```{eval-rst} 784.. list-table:: 785 786 * - .. figure:: /images/tutorials/physics/stokes_p2p1_sys_mat.png 787 788 System matrix 789 790 - .. figure:: /images/tutorials/physics/stokes_p2p1_sys_mat_sparse.png 791 792 System matrix with sparse stencil 793 794 * - .. figure:: /images/tutorials/physics/stokes_p2p1_prec_mat.png 795 796 Matrix used to construct the preconditioner 797 798 - .. figure:: /images/tutorials/physics/stokes_p2p1_prec_mat_sparse.png 799 800 Matrix used to construct the preconditioner with sparse stencil 801``` 802 803If we want to check the convergence of the solver, we can also do that using options. Both the linear and nonlinear solvers converge in a single iteration, which is exactly what we want. In order to have this happen, we must have the tolerance on both the outer KSP solver and the inner Schur complement solver be low enough. Notice that the sure complement solver is used twice, and converges in seven iterates each time. 804 805```console 806$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-snes_monitor -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_pressure_ksp_converged_reason" 8070 SNES Function norm 1.170604545948e-01 808 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 7 809 0 KSP preconditioned resid norm 4.965098891419e-01 true resid norm 1.170604545948e-01 ||r(i)||/||b|| 1.000000000000e+00 810 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 7 811 1 KSP preconditioned resid norm 9.236813926190e-11 true resid norm 1.460072673561e-12 ||r(i)||/||b|| 1.247280884579e-11 812Linear solve converged due to CONVERGED_ATOL iterations 1 8131 SNES Function norm 1.460070661322e-12 814``` 815 816We can look at the scalability of the solve by refining the mesh. We see that the Schur complement solve looks robust to grid refinement. 817 818```console 819$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-dm_refine 2 -snes_monitor -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_pressure_ksp_converged_reason" 8200 SNES Function norm 3.503062983054e-02 821 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 8 822 0 KSP preconditioned resid norm 9.943095979973e-01 true resid norm 3.503062983054e-02 ||r(i)||/||b|| 1.000000000000e+00 823 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 8 824 1 KSP preconditioned resid norm 1.148772629230e-10 true resid norm 2.693482255004e-13 ||r(i)||/||b|| 7.688934706664e-12 825Linear solve converged due to CONVERGED_RTOL iterations 1 8261 SNES Function norm 2.693649920420e-13 827$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-dm_refine 4 -snes_monitor -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_pressure_ksp_converged_reason" 8280 SNES Function norm 8.969202737759e-03 829 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 6 830 0 KSP preconditioned resid norm 3.322375727167e+00 true resid norm 8.969202737759e-03 ||r(i)||/||b|| 1.000000000000e+00 831 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 6 832 1 KSP preconditioned resid norm 6.112282404006e-10 true resid norm 8.543800889926e-14 ||r(i)||/||b|| 9.525708292843e-12 833Linear solve converged due to CONVERGED_RTOL iterations 1 8341 SNES Function norm 8.543893996362e-14 835``` 836 837Starting off with an exact solver allows us to check that the discretization, equations, and boundary conditions are correct. Moreover, choosing the Schur complement formulation, rather than a sparse direct solve, gives us a path to incremental boost the scalability. Our first step will be to replace the direct solve of the momentum operator, which has cost superlinear in $N$, with a more scalable alternative. Since the operator is still elliptic, despite the viscosity variation, we should be able to use some form of multigrid. We will start with algebraic multigrid because it handles coefficient variation well, even if the setup time is larger than the geometric variant. 838 839```console 840$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1" EXTRA_OPTIONS="-dm_refine 2 -snes_monitor -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_pressure_ksp_converged_reason -fieldsplit_velocity_pc_type gamg -fieldsplit_velocity_ksp_converged_reason" 8410 SNES Function norm 3.503062983054e-02 842 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 8 843 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 844 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 845 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 846 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 10 847 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 848 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 849 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 8 850 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 8 851 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 8 852 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 853 0 KSP preconditioned resid norm 9.943097452179e-01 true resid norm 3.503062983054e-02 ||r(i)||/||b|| 1.000000000000e+00 854 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 8 855 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 856 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 857 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 858 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 10 859 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 860 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 861 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 8 862 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 8 863 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 8 864 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 865 1 KSP preconditioned resid norm 1.503326145261e-05 true resid norm 1.089276827085e-06 ||r(i)||/||b|| 3.109498265814e-05 866 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 10 867 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 868 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 869 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 870 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 871 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 872 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 873 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 874 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 875 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 9 876 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 9 877 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 10 878 2 KSP preconditioned resid norm 1.353007845554e-10 true resid norm 6.056095141823e-11 ||r(i)||/||b|| 1.728799959098e-09 879Linear solve converged due to CONVERGED_RTOL iterations 2 8801 SNES Function norm 6.056096909907e-11 881``` 882 883This looks alright, but the number of iterates grows with refinement. At 3 refinements, it is 16, 30 at 4 refinements, and 70 at 5 refinements. Increasing the number of smoother iterates to four, `-fieldsplit_velocity_mg_levels_ksp_max_it 4`, brings down the number of iterates, but not the growth. Using w-cycles and full multigrid does not help either. It is likely that the coarse grids made by MIS are inaccurate for the $P_2$ discretization. 884 885We can instead use geometric multigrid, and we would hope get more accurate coarse bases. The `-dm_refine_hierarchy` allows us to make a hierarchy of refined meshes and sets the number of multigrid levels automatically. Then all we need to specify is `-fieldsplit_velocity_pc_type mg`, as we see in the test 886 887```{literalinclude} /../src/snes/tutorials/ex69.c 888:end-before: 'test:' 889:start-at: 'suffix: p2p1_gmg' 890``` 891 892This behaves well for the initial mesh, 893 894```console 895$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1_gmg" EXTRA_OPTIONS="-dm_refine_hierarchy 2 -snes_monitor -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_pressure_ksp_converged_reason -fieldsplit_velocity_ksp_converged_reason" 8960 SNES Function norm 3.503062983054e-02 897 0 KSP unpreconditioned resid norm 3.503062983054e-02 true resid norm 3.503062983054e-02 ||r(i)||/||b|| 1.000000000000e+00 898 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 899 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 900 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 901 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 902 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 903 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 5 904 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 5 905 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 906 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 907 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 8 908 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 909 1 KSP unpreconditioned resid norm 4.643855168829e-06 true resid norm 4.643855168807e-06 ||r(i)||/||b|| 1.325655630878e-04 910 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 5 911 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 5 912 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 913 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 914 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 915 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 916 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 917 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 918 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 919 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 5 920 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 9 921 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 5 922 2 KSP unpreconditioned resid norm 1.520240889941e-11 true resid norm 1.520239396618e-11 ||r(i)||/||b|| 4.339743258890e-10 923Linear solve converged due to CONVERGED_ATOL iterations 2 9241 SNES Function norm 1.520237877998e-11 925``` 926 927and is also stable under refinement 928 929```console 930$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1_gmg" EXTRA_OPTIONS="-dm_refine_hierarchy 4 -snes_monitor -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_pressure_ksp_converged_reason -fieldsplit_velocity_ksp_converged_reason" 9310 SNES Function norm 3.503062983054e-02 932 0 KSP unpreconditioned resid norm 3.503062983054e-02 true resid norm 3.503062983054e-02 ||r(i)||/||b|| 1.000000000000e+00 933 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 934 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 935 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 936 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 937 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 938 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 5 939 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 5 940 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 941 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 942 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 8 943 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 944 1 KSP unpreconditioned resid norm 4.643855168829e-06 true resid norm 4.643855168807e-06 ||r(i)||/||b|| 1.325655630878e-04 945 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 5 946 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 5 947 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 948 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 949 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 950 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 951 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 952 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 953 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 4 954 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 5 955 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 9 956 Linear fieldsplit_velocity_ solve converged due to CONVERGED_RTOL iterations 5 957 2 KSP unpreconditioned resid norm 1.520240889941e-11 true resid norm 1.520239396618e-11 ||r(i)||/||b|| 4.339743258890e-10 958Linear solve converged due to CONVERGED_ATOL iterations 2 9591 SNES Function norm 1.520237877998e-11 960``` 961 962Finally, we can back off the pressure solve. `ILU(0)` is good enough to maintain a constant number of iterates as we refine the grid. We could continue to refine our preconditioner by playing with the tolerance of the inner multigrid and Schur complement solves, trading fewer inner iterates for more outer iterates. 963 964```console 965$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1_gmg" EXTRA_OPTIONS="-dm_refine_hierarchy 2 -snes_monitor -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_pressure_ksp_converged_reason -fieldsplit_pressure_pc_type ilu" 9660 SNES Function norm 3.503062983054e-02 967 0 KSP unpreconditioned resid norm 3.503062983054e-02 true resid norm 3.503062983054e-02 ||r(i)||/||b|| 1.000000000000e+00 968 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 10 969 1 KSP unpreconditioned resid norm 4.643855785779e-06 true resid norm 4.643855785812e-06 ||r(i)||/||b|| 1.325655807011e-04 970 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 10 971 2 KSP unpreconditioned resid norm 1.521944777036e-11 true resid norm 1.521942998859e-11 ||r(i)||/||b|| 4.344606437913e-10 972Linear solve converged due to CONVERGED_ATOL iterations 2 9731 SNES Function norm 1.521943449163e-11 974$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1_gmg" EXTRA_OPTIONS="-dm_refine_hierarchy 4 -snes_monitor -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_pressure_ksp_converged_reason -fieldsplit_pressure_pc_type ilu" 9750 SNES Function norm 8.969202737759e-03 976 0 KSP unpreconditioned resid norm 8.969202737759e-03 true resid norm 8.969202737759e-03 ||r(i)||/||b|| 1.000000000000e+00 977 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 10 978 1 KSP unpreconditioned resid norm 2.234849111673e-05 true resid norm 2.234849111674e-05 ||r(i)||/||b|| 2.491692045566e-03 979 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 10 980 2 KSP unpreconditioned resid norm 1.205594722917e-10 true resid norm 1.205594316079e-10 ||r(i)||/||b|| 1.344148807121e-08 981 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 10 982 3 KSP unpreconditioned resid norm 1.461086575333e-15 true resid norm 2.284323415523e-15 ||r(i)||/||b|| 2.546852247977e-13 983Linear solve converged due to CONVERGED_ATOL iterations 3 9841 SNES Function norm 2.317901194143e-15 985$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1_gmg" EXTRA_OPTIONS="-dm_refine_hierarchy 6 -snes_monitor -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_pressure_ksp_converged_reason -fieldsplit_pressure_pc_type ilu" 9860 SNES Function norm 2.252260693635e-03 987 0 KSP unpreconditioned resid norm 2.252260693635e-03 true resid norm 2.252260693635e-03 ||r(i)||/||b|| 1.000000000000e+00 988 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 9 989 1 KSP unpreconditioned resid norm 1.220195757583e-05 true resid norm 1.220195757579e-05 ||r(i)||/||b|| 5.417648858445e-03 990 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 10 991 2 KSP unpreconditioned resid norm 2.683367607036e-09 true resid norm 2.683367591382e-09 ||r(i)||/||b|| 1.191410745197e-06 992 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 10 993 3 KSP unpreconditioned resid norm 5.510932827474e-13 true resid norm 5.511665167379e-13 ||r(i)||/||b|| 2.447170162386e-10 994Linear solve converged due to CONVERGED_ATOL iterations 3 9951 SNES Function norm 5.511916500930e-13 996``` 997 998We can make the problem harder by increasing the wave number and size of the viscosity perturbation. If we set the $B$ parameter to 6.9, we have a factor of one million increase in viscosity across the cell. At this scale, we see that we lose enough accuracy in our Jacobian calculation to defeat our Taylor test, but we are still able to solve the problem efficiently. 999 1000```console 1001$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1_gmg" EXTRA_OPTIONS="-dm_refine_hierarchy 2 -snes_monitor -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_pressure_ksp_converged_reason -fieldsplit_pressure_pc_type ilu -m 2 -n 2 -B 6.9" 1002L_2 Error: [4.07817e-06, 0.0104694] 1003L_2 Residual: 0.0145403 1004Taylor approximation converging at order 1.00 10050 SNES Function norm 3.421266970274e-02 1006 0 KSP unpreconditioned resid norm 3.421266970274e-02 true resid norm 3.421266970274e-02 ||r(i)||/||b|| 1.000000000000e+00 1007 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 21 1008 1 KSP unpreconditioned resid norm 2.066264276201e-05 true resid norm 2.066264276201e-05 ||r(i)||/||b|| 6.039471032675e-04 1009 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 20 1010 2 KSP unpreconditioned resid norm 1.295461366009e-10 true resid norm 1.295461419342e-10 ||r(i)||/||b|| 3.786496144842e-09 1011 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 20 1012 3 KSP unpreconditioned resid norm 1.954355290546e-15 true resid norm 1.954135246291e-15 ||r(i)||/||b|| 5.711729786858e-14 1013Linear solve converged due to CONVERGED_ATOL iterations 3 10141 SNES Function norm 1.946196473520e-15 1015$ make -f ./gmakefile test search="snes_tutorials-ex69_p2p1_gmg" EXTRA_OPTIONS="-dm_refine_hierarchy 6 -snes_monitor -ksp_monitor_true_residual -ksp_converged_reason -fieldsplit_pressure_ksp_converged_reason -fieldsplit_pressure_pc_type ilu -m 2 -n 2 -B 6.9" 1016L_2 Error: [1.52905e-09, 4.72606e-05] 1017L_2 Residual: 7.18836e-06 1018Taylor approximation converging at order 1.00 10190 SNES Function norm 2.252034794902e-03 1020 0 KSP unpreconditioned resid norm 2.252034794902e-03 true resid norm 2.252034794902e-03 ||r(i)||/||b|| 1.000000000000e+00 1021 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 19 1022 1 KSP unpreconditioned resid norm 1.843225742581e-05 true resid norm 1.843225742582e-05 ||r(i)||/||b|| 8.184712539768e-03 1023 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 19 1024 2 KSP unpreconditioned resid norm 1.410472862037e-09 true resid norm 1.410472860342e-09 ||r(i)||/||b|| 6.263104209294e-07 1025 Linear fieldsplit_pressure_ solve converged due to CONVERGED_RTOL iterations 19 1026 3 KSP unpreconditioned resid norm 1.051996270409e-14 true resid norm 1.064465321443e-14 ||r(i)||/||b|| 4.726682393419e-12 1027Linear solve converged due to CONVERGED_ATOL iterations 3 10281 SNES Function norm 1.063917948054e-14 1029``` 1030 1031## Bibliography 1032 1033```{eval-rst} 1034.. bibliography:: /petsc.bib 1035 :filter: docname in docnames 1036``` 1037