1# Theory and Background 2 3HONEE solves the time-dependent Navier-Stokes equations of compressible gas dynamics in a static Eulerian three-dimensional frame using unstructured high-order finite/spectral element spatial discretizations and explicit or implicit high-order time-stepping (available in PETSc). 4Moreover, the Navier-Stokes example has been developed using PETSc, so that the pointwise physics (defined at quadrature points) is separated from the parallelization and meshing concerns. 5 6## The Navier-Stokes Equations 7 8The mathematical formulation (from {cite}`shakib1991femcfd`) is given in what follows. 9The compressible Navier-Stokes equations in conservative form are 10 11$$ 12\begin{aligned} 13\frac{\partial \rho}{\partial t} + \nabla \cdot \bm{U} &= 0 \\ 14\frac{\partial \bm{U}}{\partial t} + \nabla \cdot \left( \frac{\bm{U} \otimes \bm{U}}{\rho} + P \bm{I}_3 -\bm\sigma \right) - \rho \bm{b} &= 0 \\ 15\frac{\partial E}{\partial t} + \nabla \cdot \left( \frac{(E + P)\bm{U}}{\rho} -\bm{u} \cdot \bm{\sigma} - k \nabla T \right) - \rho \bm{b} \cdot \bm{u} &= 0 \, , \\ 16\end{aligned} 17$$ (eq-ns) 18 19where $\bm{\sigma} = \mu(\nabla \bm{u} + (\nabla \bm{u})^T + \lambda (\nabla \cdot \bm{u})\bm{I}_3)$ is the Cauchy (symmetric) stress tensor, with $\mu$ the dynamic viscosity coefficient, and $\lambda = - 2/3$ the Stokes hypothesis constant. 20In equations {eq}`eq-ns`, $\rho$ represents the volume mass density, $U$ the momentum density (defined as $\bm{U}=\rho \bm{u}$, where $\bm{u}$ is the vector velocity field), $E$ the total energy density (defined as $E = \rho e$, where $e$ is the total energy including thermal and kinetic but not potential energy), $\bm{I}_3$ represents the $3 \times 3$ identity matrix, $\bm{b}$ is a body force vector (e.g., gravity vector $\bm{g}$), $k$ the thermal conductivity constant, $T$ represents the temperature, and $P$ the pressure, given by the following equation of state 21 22$$ 23P = \left( {c_p}/{c_v} -1\right) \left( E - {\bm{U}\cdot\bm{U}}/{(2 \rho)} \right) \, , 24$$ (eq-state) 25 26where $c_p$ is the specific heat at constant pressure and $c_v$ is the specific heat at constant volume (that define $\gamma = c_p / c_v$, the specific heat ratio). 27 28The system {eq}`eq-ns` can be rewritten in vector form 29 30$$ 31\frac{\partial \bm{q}}{\partial t} + \nabla \cdot \bm{F}(\bm{q}) -S(\bm{q}) = 0 \, , 32$$ (eq-vector-ns) 33 34for the state variables 5-dimensional vector 35 36$$ 37\bm{q} = 38\begin{pmatrix} 39 \rho \\ 40 \bm{U} \equiv \rho \bm{ u }\\ 41 E \equiv \rho e 42\end{pmatrix} 43\begin{array}{l} 44 \leftarrow\textrm{ volume mass density}\\ 45 \leftarrow\textrm{ momentum density}\\ 46 \leftarrow\textrm{ energy density} 47\end{array} 48$$ 49 50where the flux and the source terms, respectively, are given by 51 52$$ 53\begin{aligned} 54\bm{F}(\bm{q}) &= 55\underbrace{\begin{pmatrix} 56 \bm{U}\\ 57 {(\bm{U} \otimes \bm{U})}/{\rho} + P \bm{I}_3 \\ 58 {(E + P)\bm{U}}/{\rho} 59\end{pmatrix}}_{\bm F_{\text{adv}}} + 60\underbrace{\begin{pmatrix} 610 \\ 62- \bm{\sigma} \\ 63 - \bm{u} \cdot \bm{\sigma} - k \nabla T 64\end{pmatrix}}_{\bm F_{\text{diff}}},\\ 65S(\bm{q}) &= 66 \begin{pmatrix} 67 0\\ 68 \rho \bm{b}\\ 69 \rho \bm{b}\cdot \bm{u} 70\end{pmatrix}. 71\end{aligned} 72$$ (eq-ns-flux) 73 74### Finite Element Formulation (Spatial Discretization) 75 76Let the discrete solution be 77 78$$ 79\bm{q}_N (\bm{x},t)^{(e)} = \sum_{k=1}^{P}\psi_k (\bm{x})\bm{q}_k^{(e)} 80$$ 81 82with $P=p+1$ the number of nodes in the element $e$. 83We use tensor-product bases $\psi_{kji} = h_i(X_0)h_j(X_1)h_k(X_2)$. 84 85To obtain a finite element discretization, we first multiply the strong form {eq}`eq-vector-ns` by a test function $\bm v \in H^1(\Omega)$ and integrate, 86 87$$ 88\int_{\Omega} \bm v \cdot \left(\frac{\partial \bm{q}_N}{\partial t} + \nabla \cdot \bm{F}(\bm{q}_N) - \bm{S}(\bm{q}_N) \right) \,dV = 0 \, , \; \forall \bm v \in \mathcal{V}_p\,, 89$$ 90 91with $\mathcal{V}_p = \{ \bm v(\bm x) \in H^{1}(\Omega_e) \,|\, \bm v(\bm x_e(\bm X)) \in P_p(\bm{I}), e=1,\ldots,N_e \}$ a mapped space of polynomials containing at least polynomials of degree $p$ (with or without the higher mixed terms that appear in tensor product spaces). 92 93Integrating by parts on the divergence term, we arrive at the weak form, 94 95$$ 96\begin{aligned} 97\int_{\Omega} \bm v \cdot \left( \frac{\partial \bm{q}_N}{\partial t} - \bm{S}(\bm{q}_N) \right) \,dV 98- \int_{\Omega} \nabla \bm v \!:\! \bm{F}(\bm{q}_N)\,dV & \\ 99+ \int_{\partial \Omega} \bm v \cdot \bm{F}(\bm q_N) \cdot \widehat{\bm{n}} \,dS 100 &= 0 \, , \; \forall \bm v \in \mathcal{V}_p \,, 101\end{aligned} 102$$ (eq-weak-vector-ns) 103 104where $\bm{F}(\bm q_N) \cdot \widehat{\bm{n}}$ is typically replaced with a boundary condition. 105 106:::{note} 107The notation $\nabla \bm v \!:\! \bm F$ represents contraction over both fields and spatial dimensions while a single dot represents contraction in just one, which should be clear from context, e.g., $\bm v \cdot \bm S$ contracts over fields while $\bm F \cdot \widehat{\bm n}$ contracts over spatial dimensions. 108::: 109 110### Time Discretization 111<!-- TODO: This should be reframed in terms of PETSc TS's F(t, u, \dot u) = G(t, u) rather than specifically about Explicit RK --> 112For the time discretization, we use two types of time stepping schemes through PETSc. 113 114 115#### Explicit Time-Stepping Method 116<!-- TODO: This should talk about the mass operator and the options associated with it (i.e. lumped mass matrix )--> 117The following explicit formulation is solved with the adaptive Runge-Kutta-Fehlberg (RKF4-5) method by default (any explicit time-stepping scheme available in PETSc can be chosen at runtime) 118 119$$ 120\bm{q}_N^{n+1} = \bm{q}_N^n + \Delta t \sum_{i=1}^{s} b_i k_i \, , 121$$ 122 123where 124 125$$ 126\begin{aligned} 127 k_1 &= f(t^n, \bm{q}_N^n)\\ 128 k_2 &= f(t^n + c_2 \Delta t, \bm{q}_N^n + \Delta t (a_{21} k_1))\\ 129 k_3 &= f(t^n + c_3 \Delta t, \bm{q}_N^n + \Delta t (a_{31} k_1 + a_{32} k_2))\\ 130 \vdots&\\ 131 k_i &= f\left(t^n + c_i \Delta t, \bm{q}_N^n + \Delta t \sum_{j=1}^s a_{ij} k_j \right)\\ 132\end{aligned} 133$$ 134 135and with 136 137$$ 138f(t^n, \bm{q}_N^n) = - [\nabla \cdot \bm{F}(\bm{q}_N)]^n + [S(\bm{q}_N)]^n \, . 139$$ 140 141#### Implicit Time-Stepping Method 142 143This time stepping method which can be selected using the option `-implicit` is solved with Backward Differentiation Formula (BDF) method by default (similarly, any implicit time-stepping scheme available in PETSc can be chosen at runtime). 144The implicit formulation solves nonlinear systems for $\bm q_N$: 145 146$$ 147\bm f(\bm q_N) \equiv \bm g(t^{n+1}, \bm{q}_N, \bm{\dot{q}}_N) = 0 \, , 148$$ (eq-ts-implicit-ns) 149 150where the time derivative $\bm{\dot q}_N$ is defined by 151 152$$ 153\bm{\dot{q}}_N(\bm q_N) = \alpha \bm q_N + \bm z_N 154$$ 155 156in terms of $\bm z_N$ from prior state and $\alpha > 0$, both of which depend on the specific time integration scheme (backward difference formulas, generalized alpha, implicit Runge-Kutta, etc.). 157Each nonlinear system {eq}`eq-ts-implicit-ns` will correspond to a weak form, as explained below. 158In determining how difficult a given problem is to solve, we consider the Jacobian of {eq}`eq-ts-implicit-ns`, 159 160$$ 161\frac{\partial \bm f}{\partial \bm q_N} = \frac{\partial \bm g}{\partial \bm q_N} + \alpha \frac{\partial \bm g}{\partial \bm{\dot q}_N}. 162$$ 163 164The scalar "shift" $\alpha$ scales inversely with the time step $\Delta t$, so small time steps result in the Jacobian being dominated by the second term, which is a sort of "mass matrix", and typically well-conditioned independent of grid resolution with a simple preconditioner (such as Jacobi). 165In contrast, the first term dominates for large time steps, with a condition number that grows with the diameter of the domain and polynomial degree of the approximation space. 166Both terms are significant for time-accurate simulation and the setup costs of strong preconditioners must be balanced with the convergence rate of Krylov methods using weak preconditioners. 167 168More details of PETSc's time stepping solvers can be found in the [TS User Guide](https://petsc.org/release/docs/manual/ts/). 169 170### Stabilization 171We solve {eq}`eq-weak-vector-ns` using a Galerkin discretization (default) or a stabilized method, as is necessary for most real-world flows. 172 173Galerkin methods produce oscillations for transport-dominated problems (any time the cell Péclet number is larger than 1), and those tend to blow up for nonlinear problems such as the Euler equations and (low-viscosity/poorly resolved) Navier-Stokes, in which case stabilization is necessary. 174Our formulation follows {cite}`hughesetal2010`, which offers a comprehensive review of stabilization and shock-capturing methods for continuous finite element discretization of compressible flows. 175 176- **SUPG** (streamline-upwind/Petrov-Galerkin) 177 178 In this method, the weighted residual of the strong form {eq}`eq-vector-ns` is added to the Galerkin formulation {eq}`eq-weak-vector-ns`. 179 The weak form for this method is given as 180 181 $$ 182 \begin{aligned} 183 \int_{\Omega} \bm v \cdot \left( \frac{\partial \bm{q}_N}{\partial t} - \bm{S}(\bm{q}_N) \right) \,dV 184 - \int_{\Omega} \nabla \bm v \!:\! \bm{F}(\bm{q}_N)\,dV & \\ 185 + \int_{\partial \Omega} \bm v \cdot \bm{F}(\bm{q}_N) \cdot \widehat{\bm{n}} \,dS & \\ 186 + \int_{\Omega} \nabla\bm v \tcolon\left(\frac{\partial \bm F_{\text{adv}}}{\partial \bm q}\right) \bm\tau \left( \frac{\partial \bm{q}_N}{\partial t} \, + \, 187 \nabla \cdot \bm{F} \, (\bm{q}_N) - \bm{S}(\bm{q}_N) \right) \,dV &= 0 188 \, , \; \forall \bm v \in \mathcal{V}_p 189 \end{aligned} 190 $$ (eq-weak-vector-ns-supg) 191 192 This stabilization technique can be selected using the option `-stab supg`. 193 194- **SU** (streamline-upwind) 195 196 This method is a simplified version of *SUPG* {eq}`eq-weak-vector-ns-supg` which is developed for debugging/comparison purposes. The weak form for this method is 197 198 $$ 199 \begin{aligned} 200 \int_{\Omega} \bm v \cdot \left( \frac{\partial \bm{q}_N}{\partial t} - \bm{S}(\bm{q}_N) \right) \,dV 201 - \int_{\Omega} \nabla \bm v \!:\! \bm{F}(\bm{q}_N)\,dV & \\ 202 + \int_{\partial \Omega} \bm v \cdot \bm{F}(\bm{q}_N) \cdot \widehat{\bm{n}} \,dS & \\ 203 + \int_{\Omega} \nabla\bm v \tcolon\left(\frac{\partial \bm F_{\text{adv}}}{\partial \bm q}\right) \bm\tau \nabla \cdot \bm{F} \, (\bm{q}_N) \,dV 204 & = 0 \, , \; \forall \bm v \in \mathcal{V}_p 205 \end{aligned} 206 $$ (eq-weak-vector-ns-su) 207 208 This stabilization technique can be selected using the option `-stab su`. 209 210In both {eq}`eq-weak-vector-ns-su` and {eq}`eq-weak-vector-ns-supg`, $\bm\tau \in \mathbb R^{5\times 5}$ (field indices) is an intrinsic time scale matrix. 211The SUPG technique and the operator $\frac{\partial \bm F_{\text{adv}}}{\partial \bm q}$ (rather than its transpose) can be explained via an ansatz for subgrid state fluctuations $\tilde{\bm q} = -\bm\tau \bm r$ where $\bm r$ is a strong form residual. 212The forward variational form can be readily expressed by differentiating $\bm F_{\text{adv}}$ of {eq}`eq-ns-flux` 213 214$$ 215\begin{aligned} 216\diff\bm F_{\text{adv}}(\diff\bm q; \bm q) &= \frac{\partial \bm F_{\text{adv}}}{\partial \bm q} \diff\bm q \\ 217&= \begin{pmatrix} 218\diff\bm U \\ 219(\diff\bm U \otimes \bm U + \bm U \otimes \diff\bm U)/\rho - (\bm U \otimes \bm U)/\rho^2 \diff\rho + \diff P \bm I_3 \\ 220(E + P)\diff\bm U/\rho + (\diff E + \diff P)\bm U/\rho - (E + P) \bm U/\rho^2 \diff\rho 221\end{pmatrix}, 222\end{aligned} 223$$ 224 225where $\diff P$ is defined by differentiating {eq}`eq-state`. 226 227:::{dropdown} Stabilization scale $\bm\tau$ 228A velocity vector $\bm u$ can be pulled back to the reference element as $\bm u_{\bm X} = \nabla_{\bm x}\bm X \cdot \bm u$, with units of reference length (non-dimensional) per second. 229To build intuition, consider a boundary layer element of dimension $(1, \epsilon)$, for which $\nabla_{\bm x} \bm X = \bigl(\begin{smallmatrix} 2 & \\ & 2/\epsilon \end{smallmatrix}\bigr)$. 230So a small normal component of velocity will be amplified (by a factor of the aspect ratio $1/\epsilon$) in this transformation. 231The ratio $\lVert \bm u \rVert / \lVert \bm u_{\bm X} \rVert$ is a covariant measure of (half) the element length in the direction of the velocity. 232A contravariant measure of element length in the direction of a unit vector $\hat{\bm n}$ is given by $\lVert \bigl(\nabla_{\bm X} \bm x\bigr)^T \hat{\bm n} \rVert$. 233While $\nabla_{\bm X} \bm x$ is readily computable, its inverse $\nabla_{\bm x} \bm X$ is needed directly in finite element methods and thus more convenient for our use. 234If we consider a parallelogram, the covariant measure is larger than the contravariant measure for vectors pointing between acute corners and the opposite holds for vectors between oblique corners. 235 236The cell Péclet number is classically defined by $\mathrm{Pe}_h = \lVert \bm u \rVert h / (2 \kappa)$ where $\kappa$ is the diffusivity (units of $m^2/s$). 237This can be generalized to arbitrary grids by defining the local Péclet number 238 239$$ 240\mathrm{Pe} = \frac{\lVert \bm u \rVert^2}{\lVert \bm u_{\bm X} \rVert \kappa}. 241$$ (eq-peclet) 242 243For scalar advection-diffusion, the stabilization is a scalar 244 245$$ 246\tau = \frac{\xi(\mathrm{Pe})}{\lVert \bm u_{\bm X} \rVert}, 247$$ (eq-tau-advdiff) 248 249where $\xi(\mathrm{Pe}) = \coth \mathrm{Pe} - 1/\mathrm{Pe}$ approaches 1 at large local Péclet number. 250Note that $\tau$ has units of time and, in the transport-dominated limit, is proportional to element transit time in the direction of the propagating wave. 251For advection-diffusion, $\bm F(q) = \bm u q$, and thus the SU stabilization term is 252 253$$ 254\nabla v \cdot \bm u \tau \bm u \cdot \nabla q = \nabla_{\bm X} v \cdot (\bm u_{\bm X} \tau \bm u_{\bm X}) \cdot \nabla_{\bm X} q . 255$$ (eq-su-stabilize-advdiff) 256 257where the term in parentheses is a rank-1 diffusivity tensor that has been pulled back to the reference element. 258See {cite}`hughesetal2010` equations 15-17 and 34-36 for further discussion of this formulation. 259 260#### Navier-Stokes $\tau$ definition 261 262For the Navier-Stokes and Euler equations, {cite}`whiting2003hierarchical` defines a $5\times 5$ diagonal stabilization $\mathrm{diag}(\tau_c, \tau_m, \tau_m, \tau_m, \tau_E)$ consisting of 2631. continuity stabilization $\tau_c$ 2642. momentum stabilization $\tau_m$ 2653. energy stabilization $\tau_E$ 266 267The Navier-Stokes code in this example uses the following formulation for $\tau_c$, $\tau_m$, $\tau_E$: 268 269$$ 270\begin{aligned} 271 272\tau_c &= \frac{C_c \mathcal{F}}{8\rho \trace(\bm g)} \\ 273\tau_m &= \frac{C_m}{\mathcal{F}} \\ 274\tau_E &= \frac{C_E}{\mathcal{F} c_v} \\ 275\end{aligned} 276$$ 277 278$$ 279\mathcal{F} = \sqrt{ \rho^2 \left [ \left(\frac{2C_t}{\Delta t}\right)^2 280+ \bm u \cdot (\bm u \cdot \bm g)\right] 281+ C_v \mu^2 \Vert \bm g \Vert_F ^2} 282$$ 283 284where $\bm g = \nabla_{\bm x} \bm{X}^T \cdot \nabla_{\bm x} \bm{X}$ is the metric tensor and $\Vert \cdot \Vert_F$ is the Frobenius norm. 285This formulation is currently not available in the Euler code. 286 287#### Advection-Diffusion $\tau$ definition 288 289For Advection-Diffusion, we first examine a 1D definition given by: 290 291$$ 292 \tau = \textrm{minreg}_2 \left\{\frac{\Delta t}{2 C_t},\ \frac{h}{aC_a}, \ \frac{h^2}{\kappa C_d} \right\} 293$$ 294 295for $C_t$, $C_a$, $C_d$ being some scaling coefficients, $h$ is the element length, and $\textrm{minreg}_n \{x_j\} = (\sum_j x_j^{-n})^{-1/n}$. 296To make this definition compatible with higher dimensional domains, we use a similar system from the Navier-Stokes equations. 297This results in the following definition: 298 299$$ 300\begin{aligned} 301 \tau &= \textrm{minreg}_2 \left \{ \frac{\Delta t}{2 C_t}, \frac{1}{C_a \sqrt{\bm u \cdot (\bm u \cdot \bm g)}}, \frac{1}{C_d \kappa \Vert \bm g \Vert_F} \right\} \\ 302 &= \left [ \left(\frac{2 C_t}{\Delta t}\right)^2 + C_a^2 \bm u \cdot (\bm u \cdot \bm g) + \left(C_d \kappa\right)^2 \Vert \bm g \Vert_F^2\right]^{-1/2} 303\end{aligned} 304$$ 305 306Note that $\bm g$ is scaled so that it is identity for a unit square, keeping this definition aligned with the traditional 1D definition, which uses the element length directly. 307The scaling coefficients are set via `-Ctau_t`, `-Ctau_a`, and `-Ctau_d`, respectively. 308 309#### Euler $\tau$ definition 310In the Euler code, we follow {cite}`hughesetal2010` in defining a $3\times 3$ diagonal stabilization according to spatial criterion 2 (equation 27) as follows. 311 312$$ 313\tau_{ii} = c_{\tau} \frac{2 \xi(\mathrm{Pe})}{(\lambda_{\max \text{abs}})_i \lVert \nabla_{x_i} \bm X \rVert} 314$$ (eq-tau-conservative) 315 316where $c_{\tau}$ is a multiplicative constant reported to be optimal at 0.5 for linear elements, $\hat{\bm n}_i$ is a unit vector in direction $i$, and $\nabla_{x_i} = \hat{\bm n}_i \cdot \nabla_{\bm x}$ is the derivative in direction $i$. 317The flux Jacobian $\frac{\partial \bm F_{\text{adv}}}{\partial \bm q} \cdot \hat{\bm n}_i$ in each direction $i$ is a $5\times 5$ matrix with spectral radius $(\lambda_{\max \text{abs}})_i$ equal to the fastest wave speed. 318The complete set of eigenvalues of the Euler flux Jacobian in direction $i$ are (e.g., {cite}`toro2009`) 319 320$$ 321\Lambda_i = [u_i - a, u_i, u_i, u_i, u_i+a], 322$$ (eq-eigval-advdiff) 323 324where $u_i = \bm u \cdot \hat{\bm n}_i$ is the velocity component in direction $i$ and $a = \sqrt{\gamma P/\rho}$ is the sound speed for ideal gasses. 325Note that the first and last eigenvalues represent nonlinear acoustic waves while the middle three are linearly degenerate, carrying a contact wave (temperature) and transverse components of momentum. 326The fastest wave speed in direction $i$ is thus 327 328$$ 329\lambda_{\max \text{abs}} \Bigl( \frac{\partial \bm F_{\text{adv}}}{\partial \bm q} \cdot \hat{\bm n}_i \Bigr) = |u_i| + a 330$$ (eq-wavespeed) 331 332Note that this wave speed is specific to ideal gases as $\gamma$ is an ideal gas parameter; other equations of state will yield a different acoustic wave speed. 333 334::: 335 336Currently, this demo provides three types of problems/physical models that can be selected at run time via the option `-problem`. 337{ref}`problem-advection`, the problem of the transport of energy in a uniform vector velocity field, {ref}`problem-euler-vortex`, the exact solution to the Euler equations, and the so called {ref}`example-density-current` problem. 338 339### Divergence of Diffusive Flux Projection 340 341The strong residual in the SUPG operator in {eq}`eq-weak-vector-ns-supg` and {eq}`eq-weak-vector-ns-su` features the term $\nabla \cdot \bm F_{\text{diff}}(\bm{q}_N)$, the divergence of the diffusive flux. 342This term requires a second derivative to evaluate; first to evaluate $\bm \sigma$ and $\nabla T$ for $F_{\text{diff}}$, the second for the divergence of the flux. 343For linear elements, the flux is constant within each element so the second derivative is zero, leading to accuracy issues. 344Additionally, libCEED does not currently support calculating double-derivatives. 345To circumvent these issues, we (optionally) perform a projection operation to get $\nabla \cdot \bm F_{\text{diff}}(\bm{q}_N)$ at quadrature points. 346This was first proposed in {cite}`jansenDiffFluxProjection1999`. 347There are two methods of achieving this implemented in HONEE, denoted as the direct and indirect methods. 348 349#### Indirect Projection 350 351Indirect projection is the method presented in {cite}`jansenDiffFluxProjection1999`. 352Here, $\bm F_{\text{diff}}$ is $L^2$ projected onto the finite element space and then the divergence is taken from that FEM function. 353For linear basis functions, this leads to constant values of $\nabla \cdot \bm F_{\text{diff}}$ within each element. 354 355For compressible Navier-Stokes, this requires projecting 12 scalars-per-node: 4 conserved scalars (mass conservation does not have a diffusive flux term) in 3 dimensional directions. 356These 12 scalar finite element functions' derivatives are then evaluated at quadrature points and the divergence is calculated. 357This method can be selected with `-div_diff_flux_projection_method indirect`. 358 359#### Direct Projection 360In the direct projection method, we perform an $L^2$ projection of the divergence of the diffusive flux itself, $\nabla \cdot \bm F_{\text{diff}}(\bm{q}_N)$. 361Then $\nabla \cdot \bm F_{\text{diff}}$ itself is a function on the finite element space and can be interpolated onto quadrature points. 362 363To do this, look at the RHS of the $L^2$ projection of $\nabla \cdot \bm F_{\text{diff}}(\bm{q}_N)$: 364 365$$ 366\int_{\Omega} \bm v \cdot \nabla \cdot \bm F_{\text{diff}}(\bm{q}_N) \,dV 367$$ 368 369As noted, we can't calculate $\nabla \cdot \bm F_{\text{diff}}(\bm{q}_N)$ at quadrature points, so we apply integration-by-parts to achieve a calculable RHS: 370 371$$ 372\int_{\partial \Omega} \bm v \cdot \bm{F}_{\text{diff}}(\bm q_N) \cdot \widehat{\bm{n}} \,dS 373- \int_{\Omega} \nabla \bm v \!:\! \bm{F}_{\text{diff}}(\bm{q}_N)\,dV 374$$ 375 376This form is what is used for calculating the RHS of the projection. 377After the projection, $\nabla \cdot \bm F_{\text{diff}}(\bm{q}_N)$ is interpolated directly to quadrature points without any extra calculations necessary. 378For compressible Navier-Stokes, this means only projecting only 4 scalars-per-node. 379 380The projection can be enabled using `-div_diff_flux_projection_method direct`. 381 382#### General Information 383The $L^2$ projection in either method uses the standard mass matrix, which is rowsum lumped for performance by default. 384The linear solve for the projection can be controlled via `-div_diff_flux_projection_ksp*` flags. 385 386### Subgrid Stress Modeling 387 388When a fluid simulation is under-resolved (the smallest length scale resolved by the grid is much larger than the smallest physical scale, the [Kolmogorov length scale](https://en.wikipedia.org/wiki/Kolmogorov_microscales)), this is mathematically interpreted as filtering the Navier-Stokes equations. 389This is known as large-eddy simulation (LES), as only the "large" scales of turbulence are resolved. 390This filtering operation results in an extra stress-like term, $\bm{\tau}^r$, representing the effect of unresolved (or "subgrid" scale) structures in the flow. 391Denoting the filtering operation by $\overline \cdot$, the LES governing equations are: 392 393$$ 394\frac{\partial \bm{\overline q}}{\partial t} + \nabla \cdot \bm{\overline F}(\bm{\overline q}) -S(\bm{\overline q}) = 0 \, , 395$$ (eq-vector-les) 396 397where 398 399$$ 400\bm{\overline F}(\bm{\overline q}) = 401\bm{F} (\bm{\overline q}) + 402\begin{pmatrix} 403 0\\ 404 \bm{\tau}^r \\ 405 \bm{u} \cdot \bm{\tau}^r 406\end{pmatrix} 407$$ (eq-les-flux) 408 409More details on deriving the above expression, filtering, and large eddy simulation can be found in {cite}`popeTurbulentFlows2000`. 410To close the problem, the subgrid stress must be defined. 411For implicit LES, the subgrid stress is set to zero and the numerical properties of the discretized system are assumed to account for the effect of subgrid scale structures on the filtered solution field. 412For explicit LES, it is defined by a subgrid stress model. 413 414:::{list-table} SGS Model Options 415:header-rows: 1 416 417* - Option 418 - Description 419 - Default value 420 - Unit 421 422* - `-sgs_model_type` 423 - Type of subgrid stress model to use. Currently only `data_driven` is available 424 - `none` 425 - string 426::: 427 428(sgs-dd-model)= 429#### Data-Driven SGS Model 430 431The data-driven SGS model implemented here uses a small neural network to compute the SGS term. 432The SGS tensor is calculated at nodes using an $L^2$ projection of the velocity gradient and grid anisotropy tensor, and then interpolated onto quadrature points. 433More details regarding the theoretical background of the model can be found in {cite}`prakashDDSGS2022` and {cite}`prakashDDSGSAnisotropic2022`. 434 435The neural network itself consists of 1 hidden layer and 20 neurons, using Leaky ReLU as its activation function. 436The slope parameter for the Leaky ReLU function is set via `-sgs_model_dd_leakyrelu_alpha`. 437The outputs of the network are assumed to be normalized on a min-max scale, so they must be rescaled by the original min-max bounds. 438Parameters for the neural network are put into files in a directory found in `-sgs_model_dd_parameter_dir`. 439These files store the network weights (`w1.dat` and `w2.dat`), biases (`b1.dat` and `b2.dat`), and scaling parameters (`OutScaling.dat`). 440The first row of each files stores the number of columns and rows in each file. 441Note that the weight coefficients are assumed to be in column-major order. 442This is done to keep consistent with legacy file compatibility. 443 444:::{note} 445The data-driven model parameters in the examples directory are not accurate and are for regression testing only. 446::: 447 448##### Data-Driven Model Using External Libraries 449 450There are two different modes for using the data-driven model: fused and sequential. 451 452In fused mode, the input processing, model inference, and output handling were all done in a single CeedOperator. 453Fused mode is generally faster than the sequential mode, however fused mode requires that the model architecture be manually implemented into a libCEED QFunction. 454To use the fused mode, set `-sgs_model_dd_implementation fused`. 455 456Sequential mode has separate function calls/CeedOperators for input creation, model inference, and output handling. 457By separating the three steps of the model evaluation, the sequential mode allows for functions calling external libraries to be used for the model inference step. 458The use of these external libraries allows us to leverage the flexibility of those external libraries in their model architectures. 459 460PyTorch is currently the only external library implemented with the sequential mode. 461This is enabled with `USE_TORCH=1` during the build process, which will use the PyTorch accessible from the build environment's Python interpreter. 462To specify the path to the PyTorch model file, use `-sgs_model_dd_torch_model_path`. 463The hardware used to run the model inference is determined automatically from the libCEED backend chosen, but can be overridden with `-sgs_model_dd_torch_model_device`. 464Note that if you chose to run the inference on host while using a GPU libCEED backend (e.g. `/gpu/cuda`), then host-to-device transfers (and vice versa) will be done automatically. 465 466The sequential mode is available using a libCEED based inference evaluation via `-sgs_model_dd_implementation sequential_ceed`, but it is only for verification purposes. 467 468:::{list-table} Data-driven SGS Model Options 469:header-rows: 1 470 471* - Option 472 - Description 473 - Default value 474 - Unit 475 476* - `-sgs_model_dd_leakyrelu_alpha` 477 - Slope parameter for Leaky ReLU activation function. `0` corresponds to normal ReLU 478 - 0 479 - 480 481* - `-sgs_model_dd_parameter_dir` 482 - Path to directory with data-driven model parameters (weights, biases, etc.) 483 - `./dd_sgs_parameters` 484 - string 485 486* - `-sgs_model_dd_model_implementation` 487 - Which computational implementation to use for SGS DD model (`fused`, `sequential_ceed`, `sequential_torch`) 488 - `fused` 489 - string 490 491* - `-sgs_model_dd_torch_model_path` 492 - Path to the PyTorch `*.pt` file containing the DD inference model 493 - 494 - string 495 496* - `-sgs_model_dd_torch_model_device` 497 - What hardware to perform the model inference on (`cpu`, `cuda`, `hip`, `xpu`) 498 - Default matches the libCEED backend 499 - string 500::: 501 502 503## Boundary Conditions 504 505 506(bc-stg)= 507### Synthetic Turbulence Generation (STG) 508 509We use the STG method described in {cite}`shurSTG2014`. 510Below follows a re-description of the formulation to match the present notation, and then a description of the implementation and usage. 511 512#### Equation Formulation 513 514$$ 515\bm{u}(\bm{x}, t) = \bm{\overline{u}}(\bm{x}) + \bm{C}(\bm{x}) \cdot \bm{v}' 516$$ 517 518$$ 519\begin{aligned} 520\bm{v}' &= 2 \sqrt{3/2} \sum^N_{n=1} \sqrt{q^n(\bm{x})} \bm{\sigma}^n \cos(\kappa^n \bm{d}^n \cdot \bm{\hat{x}}^n(\bm{x}, t) + \phi^n ) \\ 521\bm{\hat{x}}^n &= \left[(x - U_0 t)\max(2\kappa_{\min}/\kappa^n, 0.1) , y, z \right]^T 522\end{aligned} 523$$ 524 525Here, we define the number of wavemodes $N$, set of random numbers $ \{\bm{\sigma}^n, \bm{d}^n, \phi^n\}_{n=1}^N$, the Cholesky decomposition of the Reynolds stress tensor $\bm{C}$ (such that $\bm{R} = \bm{CC}^T$ ), bulk velocity $U_0$, wavemode amplitude $q^n$, wavemode frequency $\kappa^n$, and $\kappa_{\min} = 0.5 \min_{\bm{x}} (\kappa_e)$. 526 527$$ 528\kappa_e = \frac{2\pi}{\min(2d_w, 3.0 l_t)} 529$$ 530 531where $l_t$ is the turbulence length scale, and $d_w$ is the distance to the nearest wall. 532 533 534The set of wavemode frequencies is defined by a geometric distribution: 535 536$$ 537\kappa^n = \kappa_{\min} (1 + \alpha)^{n-1} \ , \quad \forall n=1, 2, ... , N 538$$ 539 540The wavemode amplitudes $q^n$ are defined by a model energy spectrum $E(\kappa)$: 541 542$$ 543q^n = \frac{E(\kappa^n) \Delta \kappa^n}{\sum^N_{n=1} E(\kappa^n)\Delta \kappa^n} \ ,\quad \Delta \kappa^n = \kappa^n - \kappa^{n-1} 544$$ 545 546$$ E(\kappa) = \frac{(\kappa/\kappa_e)^4}{[1 + 2.4(\kappa/\kappa_e)^2]^{17/6}} f_\eta f_{\mathrm{cut}} $$ 547 548$$ 549f_\eta = \exp \left[-(12\kappa /\kappa_\eta)^2 \right], \quad 550f_\mathrm{cut} = \exp \left( - \left [ \frac{4\max(\kappa-0.9\kappa_\mathrm{cut}, 0)}{\kappa_\mathrm{cut}} \right]^3 \right) 551$$ 552 553$\kappa_\eta$ represents turbulent dissipation frequency, and is given as $2\pi (\nu^3/\varepsilon)^{-1/4}$ with $\nu$ the kinematic viscosity and $\varepsilon$ the turbulent dissipation. 554$\kappa_\mathrm{cut}$ approximates the effective cutoff frequency of the mesh (viewing the mesh as a filter on solution over $\Omega$) and is given by: 555 556$$ 557\kappa_\mathrm{cut} = \frac{2\pi}{ 2\min\{ [\max(h_y, h_z, 0.3h_{\max}) + 0.1 d_w], h_{\max} \} } 558$$ 559 560The enforcement of the boundary condition is identical to the blasius inflow; it weakly enforces velocity, with the option of weakly enforcing either density or temperature using the `-weakT` flag. 561 562#### Initialization Data Flow 563 564Data flow for initializing function (which creates the context data struct) is given below: 565```{mermaid} 566flowchart LR 567 subgraph STGInflow.dat 568 y 569 lt[l_t] 570 eps 571 Rij[R_ij] 572 ubar 573 end 574 575 subgraph STGRand.dat 576 rand[RN Set]; 577 end 578 579 subgraph User Input 580 u0[U0]; 581 end 582 583 subgraph init[Create Context Function] 584 ke[k_e] 585 N; 586 end 587 lt --Calc-->ke --Calc-->kn 588 y --Calc-->ke 589 590 subgraph context[Context Data] 591 yC[y] 592 randC[RN Set] 593 Cij[C_ij] 594 u0 --Copy--> u0C[U0] 595 kn[k^n]; 596 ubarC[ubar] 597 ltC[l_t] 598 epsC[eps] 599 end 600 ubar --Copy--> ubarC; 601 y --Copy--> yC; 602 lt --Copy--> ltC; 603 eps --Copy--> epsC; 604 605 rand --Copy--> randC; 606 rand --> N --Calc--> kn; 607 Rij --Calc--> Cij[C_ij] 608``` 609 610This is done once at runtime. 611The spatially-varying terms are then evaluated at each quadrature point on-the-fly, either by interpolation (for $l_t$, $\varepsilon$, $C_{ij}$, and $\overline{\bm u}$) or by calculation (for $q^n$). 612 613The `STGInflow.dat` file is a table of values at given distances from the wall. 614These values are then interpolated to a physical location (node or quadrature point). It has the following format: 615``` 616[Total number of locations] 14 617[d_w] [u_1] [u_2] [u_3] [R_11] [R_22] [R_33] [R_12] [R_13] [R_23] [sclr_1] [sclr_2] [l_t] [eps] 618``` 619where each `[ ]` item is a number in scientific notation (ie. `3.1415E0`), and `sclr_1` and `sclr_2` are reserved for turbulence modeling variables. 620They are not used in this example. 621 622The `STGRand.dat` file is the table of the random number set, $\{\bm{\sigma}^n, \bm{d}^n, \phi^n\}_{n=1}^N$. 623It has the format: 624``` 625[Number of wavemodes] 7 626[d_1] [d_2] [d_3] [phi] [sigma_1] [sigma_2] [sigma_3] 627``` 628 629The following table is presented to help clarify the dimensionality of the numerous terms in the STG formulation. 630 631| Math | Label | $f(\bm{x})$? | $f(n)$? | 632| ----------------- | -------- | -------------- | --------- | 633| $ \{\bm{\sigma}^n, \bm{d}^n, \phi^n\}_{n=1}^N$ | RN Set | No | Yes | 634| $\bm{\overline{u}}$ | ubar | Yes | No | 635| $U_0$ | U0 | No | No | 636| $l_t$ | l_t | Yes | No | 637| $\varepsilon$ | eps | Yes | No | 638| $\bm{R}$ | R_ij | Yes | No | 639| $\bm{C}$ | C_ij | Yes | No | 640| $q^n$ | q^n | Yes | Yes | 641| $\{\kappa^n\}_{n=1}^N$ | k^n | No | Yes | 642| $h_i$ | h_i | Yes | No | 643| $d_w$ | d_w | Yes | No | 644 645#### Runtime Options 646 647To use the STG boundary condition, the `-bc_inflow` option should be set to the boundary faces that need the inflow (see {ref}`bc-flags`). 648The `-stg_use` flag is then used to enable/disable applying STG to those faces. 649 650:::{list-table} STG Runtime Options 651:header-rows: 1 652 653* - Option 654 - Description 655 - Default value 656 - Unit 657 658* - `-stg_use` 659 - Enable STG for `bc_inflow` faces 660 - `false` 661 - 662 663* - `-stg_inflow_path` 664 - Path to the STGInflow file 665 - `./STGInflow.dat` 666 - 667 668* - `-stg_rand_path` 669 - Path to the STGRand file 670 - `./STGRand.dat` 671 - 672 673* - `-stg_alpha` 674 - Growth rate of the wavemodes 675 - `1.01` 676 - 677 678* - `-stg_u0` 679 - Convective velocity, $U_0$ 680 - `0.0` 681 - `m/s` 682 683* - `-stg_mean_only` 684 - Only impose the mean velocity (no fluctutations) 685 - `false` 686 - 687 688* - `-stg_strong` 689 - Strongly enforce the STG inflow boundary condition 690 - `false` 691 - 692 693* - `-stg_fluctuating_IC` 694 - "Extrude" the fluctuations through the domain as an initial condition 695 - `false` 696 - 697 698* - `-stg_dx` 699 - Set the element size in the x direction. Default is calculated for box meshes, assuming equispaced elements. 700 - 701 - `m` 702 703* - `-stg_h_scale_factor` 704 - Scale element size for cutoff frequency calculation 705 - $1/p$ 706 - 707::: 708 709### Internal Damping Layer (IDL) 710:::{note} 711IDL is not a boundary condition, but it's primary application is for use with STG. 712::: 713The STG inflow boundary condition creates large amplitude acoustic waves. 714We use an internal damping layer (IDL) to damp them out without disrupting the synthetic structures developing into natural turbulent structures. 715This implementation was inspired by {cite}`shurSTG2014`, but is implemented here as a ramped volumetric forcing term, similar to a sponge layer (see 8.4.2.4 in {cite}`colonius2023turbBC` for example). 716It takes the following form: 717 718$$ 719S(\bm{q}) = -\sigma(\bm{x})\left.\frac{\partial \bm{q}}{\partial \bm{Y}}\right\rvert_{\bm{q}} \bm{Y}' 720$$ 721 722where $\bm{Y}' = [P - P_\mathrm{ref}, \bm{0}, 0]^T$, and $\sigma(\bm{x})$ is a linear ramp starting at `-idl_start` with length `-idl_length` and an amplitude of inverse `-idl_decay_rate`. 723The damping is defined in terms of a pressure-primitive anomaly $\bm Y'$ converted to conservative source using $\partial \bm{q}/\partial \bm{Y}\rvert_{\bm{q}}$, which is linearized about the current flow state. 724$P_\mathrm{ref}$ has a default value equal to `-reference_pressure` flag, with an optional flag `-idl_pressure` to set it to a different value. 725 726:::{list-table} IDL Runtime Options 727:header-rows: 1 728 729* - Option 730 - Description 731 - Default value 732 - Unit 733 734* - `-idl_decay_time` 735 - Characteristic timescale of the pressure deviance decay. The timestep is good starting point 736 - `-1` (disabled) 737 - `s` 738 739* - `-idl_start` 740 - Start of IDL in the x direction 741 - `0` 742 - `m` 743 744* - `-idl_length` 745 - Length of IDL in the positive x direction 746 - `0` 747 - `m` 748 749* - `-idl_pressure` 750 - Pressure used for IDL reference pressure 751 - `-reference_pressure` 752 - `Pa` 753::: 754 755 756<!-- TODO: Move the Riemann/freestream description here--> 757