1*6f5dc8baSWill Pazner# Defining User Q-Functions 2*6f5dc8baSWill Pazner 3*6f5dc8baSWill PaznerAn important feature of LibCEED.jl is the ability to define [user 4*6f5dc8baSWill PaznerQ-functions](https://libceed.readthedocs.io/en/latest/libCEEDapi/#gallery-of-qfunctions) 5*6f5dc8baSWill Paznernatively in Julia. These user Q-functions work with both the CPU and CUDA 6*6f5dc8baSWill Paznerbackends. 7*6f5dc8baSWill Pazner 8*6f5dc8baSWill PaznerUser Q-functions describe the action of the $D$ operator at quadrature points 9*6f5dc8baSWill Pazner(see [libCEED's theoretical 10*6f5dc8baSWill Paznerframework](https://libceed.readthedocs.io/en/latest/libCEEDapi/#theoretical-framework)). 11*6f5dc8baSWill PaznerSince the Q-functions are invoked at every quadrature point, efficiency is 12*6f5dc8baSWill Paznervery important. 13*6f5dc8baSWill Pazner 14*6f5dc8baSWill Pazner## Apply mass Q-function in C 15*6f5dc8baSWill Pazner 16*6f5dc8baSWill PaznerBefore describing how to define user Q-functions in Julia, we will briefly given 17*6f5dc8baSWill Pazneran example of a user Q-function defined in C. This is the "apply mass" 18*6f5dc8baSWill PaznerQ-function from `ex1-volume.c`, which computes the action of the mass operator. 19*6f5dc8baSWill PaznerThe mass operator on each element can be written as $B^\intercal D B$, where $B$ 20*6f5dc8baSWill Pazneris the basis operator, and $D$ represents multiplication by quadrature weights 21*6f5dc8baSWill Paznerand geometric factors (i.e. the determinant of the mesh transformation Jacobian 22*6f5dc8baSWill Paznerat each qudarture point). It is the action of $D$ that the Q-function must 23*6f5dc8baSWill Paznerimplement. The C source of the Q-function is: 24*6f5dc8baSWill Pazner 25*6f5dc8baSWill Pazner```c 26*6f5dc8baSWill Pazner/// libCEED Q-function for applying a mass operator 27*6f5dc8baSWill PaznerCEED_QFUNCTION(f_apply_mass)(void *ctx, const CeedInt Q, 28*6f5dc8baSWill Pazner const CeedScalar *const *in, 29*6f5dc8baSWill Pazner CeedScalar *const *out) { 30*6f5dc8baSWill Pazner const CeedScalar *u = in[0], *qdata = in[1]; 31*6f5dc8baSWill Pazner CeedScalar *v = out[0]; 32*6f5dc8baSWill Pazner // Quadrature Point Loop 33*6f5dc8baSWill Pazner CeedPragmaSIMD 34*6f5dc8baSWill Pazner for (CeedInt i=0; i<Q; i++) { 35*6f5dc8baSWill Pazner v[i] = qdata[i] * u[i]; 36*6f5dc8baSWill Pazner } // End of Quadrature Point Loop 37*6f5dc8baSWill Pazner return 0; 38*6f5dc8baSWill Pazner} 39*6f5dc8baSWill Pazner``` 40*6f5dc8baSWill Pazner 41*6f5dc8baSWill PaznerFrom this example, we see that a user Q-function is a C callback that takes a 42*6f5dc8baSWill Pazner"data context" pointer, a number of quadrature points, and two arrays of arrays, 43*6f5dc8baSWill Paznerone for inputs, and one for outputs. 44*6f5dc8baSWill Pazner 45*6f5dc8baSWill PaznerIn this example, the first input array is `u`, which is the value of the trial 46*6f5dc8baSWill Paznerfunction evaluated at each quadrature point. The second input array is `qdata`, 47*6f5dc8baSWill Paznerwhich contains the precomputed geometric factors. There is only one output 48*6f5dc8baSWill Paznerarray, `v`, which will store the pointwise product of `u` and `data`. Given the 49*6f5dc8baSWill Paznerdefinition of this Q-function, the `CeedQFunction` object is created by 50*6f5dc8baSWill Pazner```c 51*6f5dc8baSWill PaznerCeedQFunctionCreateInterior(ceed, 1, f_apply_mass, f_apply_mass_loc, &apply_qfunc); 52*6f5dc8baSWill PaznerCeedQFunctionAddInput(apply_qfunc, "u", 1, CEED_EVAL_INTERP); 53*6f5dc8baSWill PaznerCeedQFunctionAddInput(apply_qfunc, "qdata", 1, CEED_EVAL_NONE); 54*6f5dc8baSWill PaznerCeedQFunctionAddOutput(apply_qfunc, "v", 1, CEED_EVAL_INTERP); 55*6f5dc8baSWill Pazner``` 56*6f5dc8baSWill PaznerWhen adding the inputs and outputs, `CEED_EVAL_INTERP` indicates that the $B$ 57*6f5dc8baSWill Paznerbasis operator should be used to interpolate the trial and test functions from 58*6f5dc8baSWill Paznernodal points to quadrature points, and `CEED_EVAL_NONE` indicates that the 59*6f5dc8baSWill Pazner`qdata` is already precomputed at quadrature points, and no interpolation is 60*6f5dc8baSWill Paznerrequried. 61*6f5dc8baSWill Pazner 62*6f5dc8baSWill Pazner## Apply mass Q-function in Julia 63*6f5dc8baSWill Pazner 64*6f5dc8baSWill PaznerWe now replicate this Q-function in Julia. The main way of defining user 65*6f5dc8baSWill PaznerQ-functions in Julia is using the [`@interior_qf`](@ref) macro. The above C code 66*6f5dc8baSWill Pazner(both the definition of the Q-function, its creation, and adding the inputs and 67*6f5dc8baSWill Pazneroutputs) is analogous to the following Julia code: 68*6f5dc8baSWill Pazner 69*6f5dc8baSWill Pazner```julia 70*6f5dc8baSWill Pazner@interior_qf apply_qfunc = ( 71*6f5dc8baSWill Pazner ceed, Q, 72*6f5dc8baSWill Pazner (u, :in, EVAL_INTERP, Q), (qdata, :in, EVAL_NONE, Q), 73*6f5dc8baSWill Pazner (v, :out, EVAL_INTERP, Q), 74*6f5dc8baSWill Pazner @inbounds @simd for i=1:Q 75*6f5dc8baSWill Pazner v[i] = qdata[i]*u[i] 76*6f5dc8baSWill Pazner end 77*6f5dc8baSWill Pazner) 78*6f5dc8baSWill Pazner``` 79*6f5dc8baSWill Pazner 80*6f5dc8baSWill PaznerThis creates a [`QFunction`](@ref) object named `apply_qfunc`. The Q-function is 81*6f5dc8baSWill Paznerdefined by the tuple on the right-hand side. `ceed` is the name of the 82*6f5dc8baSWill Pazner[`Ceed`](@ref) object where the Q-function will be created, and the second 83*6f5dc8baSWill Paznerargument, `Q`, is the name of that variable that will contain the number of 84*6f5dc8baSWill Paznerquadrature points. The next three arguments are specifications of the input and 85*6f5dc8baSWill Pazneroutput fields: 86*6f5dc8baSWill Pazner```julia 87*6f5dc8baSWill Pazner (u, :in, EVAL_INTERP, Q), 88*6f5dc8baSWill Pazner (qdata, :in, EVAL_NONE, Q), 89*6f5dc8baSWill Pazner (v, :out, EVAL_INTERP, Q), 90*6f5dc8baSWill Pazner``` 91*6f5dc8baSWill PaznerEach input or output field specification is a tuple, where the first entry is 92*6f5dc8baSWill Paznerthe name of the array, and the second entry is either `:in` or `:out`, according 93*6f5dc8baSWill Paznerto whether the array is an input or output array. The third entry is the 94*6f5dc8baSWill Pazner[`EvalMode`](@ref) of the field. The remaining entries are the dimensions of the 95*6f5dc8baSWill Paznerarray. The first dimension is always equal to the number of quadrature points. 96*6f5dc8baSWill PaznerIn this case, all the arrays are simply vectors whose size is equal to the 97*6f5dc8baSWill Paznernumber of quadrature points, but in more sophisticated examples (e.g. the [apply 98*6f5dc8baSWill Paznerdiffusion Q-function](@ref applydiff)) these arrays could consists of vectors or 99*6f5dc8baSWill Paznermatrices at each quadrature point. After providing all of the array 100*6f5dc8baSWill Paznerspecifications, the body of the Q-function is provided. 101*6f5dc8baSWill Pazner 102*6f5dc8baSWill Pazner## [Apply diffusion Q-function in Julia](@id applydiff) 103*6f5dc8baSWill Pazner 104*6f5dc8baSWill PaznerFor a more sophisticated example of a Q-function, we consider the "apply 105*6f5dc8baSWill Paznerdiffusion" Q-function, used in `ex2-surface`. This Q-function computes the 106*6f5dc8baSWill Pazneraction of the diffusion operator. When written in the form $B^\intercal D B$, in 107*6f5dc8baSWill Paznerthis case $B$ represents the basis gradient matrix, and $D$ represents 108*6f5dc8baSWill Paznermultiplication by $w \det(J) J^{-\intercal} J^{-1}$, where $J$ is the mesh 109*6f5dc8baSWill Paznertransformation Jacobian, and $w$ is the quadrature weight. 110*6f5dc8baSWill Pazner 111*6f5dc8baSWill PaznerThis Q-function is implemented in Julia as follows: 112*6f5dc8baSWill Pazner```julia 113*6f5dc8baSWill Pazner@interior_qf apply_qfunc = ( 114*6f5dc8baSWill Pazner ceed, Q, dim=dim, 115*6f5dc8baSWill Pazner (du, :in, EVAL_GRAD, Q, dim), 116*6f5dc8baSWill Pazner (qdata, :in, EVAL_NONE, Q, dim*(dim+1)÷2), 117*6f5dc8baSWill Pazner (dv, :out, EVAL_GRAD, Q, dim), 118*6f5dc8baSWill Pazner @inbounds @simd for i=1:Q 119*6f5dc8baSWill Pazner dXdxdXdxT = getvoigt(@view(qdata[i,:]), CeedDim(dim)) 120*6f5dc8baSWill Pazner dui = SVector{dim}(@view(du[i,:])) 121*6f5dc8baSWill Pazner dv[i,:] .= dXdxdXdxT*dui 122*6f5dc8baSWill Pazner end 123*6f5dc8baSWill Pazner) 124*6f5dc8baSWill Pazner``` 125*6f5dc8baSWill PaznerIn contrast to the previous example, before the field specifications, this 126*6f5dc8baSWill PaznerQ-function includes a _constant definition_ `dim=dim`. The 127*6f5dc8baSWill Pazner[`@interior_qf`](@ref) macro allows for any number of constant definitions, 128*6f5dc8baSWill Paznerwhich make the specified values available within the body of the Q-function as 129*6f5dc8baSWill Paznercompile-time constants. 130*6f5dc8baSWill Pazner 131*6f5dc8baSWill PaznerIn this example, `dim` is either 1, 2, or 3 according to the spatial dimension 132*6f5dc8baSWill Paznerof the problem. When the user Q-function is defined, LibCEED.jl will JIT compile 133*6f5dc8baSWill Paznerthe body of the Q-function and make it available to libCEED as a C callback. In 134*6f5dc8baSWill Paznerthe body of this Q-function, `dim` will be available, and its value will be a 135*6f5dc8baSWill Paznercompile-time constant, allowing for (static) dispatch based on the value of 136*6f5dc8baSWill Pazner`dim`, and eliminating branching. 137*6f5dc8baSWill Pazner 138*6f5dc8baSWill PaznerNote that `dim` is also available for use in the field specifications. In this 139*6f5dc8baSWill Paznerexample, the field specifications are slightly more involved that in the 140*6f5dc8baSWill Paznerprevious example. The arrays are given by 141*6f5dc8baSWill Pazner```julia 142*6f5dc8baSWill Pazner (du, :in, EVAL_GRAD, Q, dim), 143*6f5dc8baSWill Pazner (qdata, :in, EVAL_NONE, Q, dim*(dim+1)÷2), 144*6f5dc8baSWill Pazner (dv, :out, EVAL_GRAD, Q, dim), 145*6f5dc8baSWill Pazner``` 146*6f5dc8baSWill PaznerNote that the input array `du` has [`EvalMode`](@ref) `EVAL_GRAD`, meaning that 147*6f5dc8baSWill Paznerthis array stores the gradient of the trial function at each quadrature point. 148*6f5dc8baSWill PaznerTherefore, at each quadrature point, `du` stores a vector of length `dim`, and 149*6f5dc8baSWill Paznerso the shape of `du` is `(Q, dim)`. Similarly, the action of $D$ is given by 150*6f5dc8baSWill Pazner$w \det(J) J^{-\intercal} J^{-1} \nabla u$, which is also a vector of length `dim` at 151*6f5dc8baSWill Paznereach quadrature point. This means that the output array `dv` also has shape 152*6f5dc8baSWill Pazner`(Q, dim)`. 153*6f5dc8baSWill Pazner 154*6f5dc8baSWill PaznerThe geometric factors stored in `qdata` represent the symmetric matrix $w 155*6f5dc8baSWill Pazner\det(J) J^{-\intercal} J^{-1}$ evaluated at every quadrature point. In order to 156*6f5dc8baSWill Paznerreduce data usage, instead of storing this data as a $d \times d$ matrix, we use 157*6f5dc8baSWill Paznerthe fact that we know it is symmetric to only store $d(d+1)/2$ entries, and the 158*6f5dc8baSWill Paznerremaining entries we infer by symmetry. These entries are stored using the 159*6f5dc8baSWill Pazner[Voigt convention](https://en.wikipedia.org/wiki/Voigt_notation). LibCEED.jl 160*6f5dc8baSWill Paznerprovides some [utilities](Misc.md#LibCEED.getvoigt) for storing and extracting 161*6f5dc8baSWill Paznersymmetric matrices stored in this fashion. 162*6f5dc8baSWill Pazner 163*6f5dc8baSWill PaznerAfter the field specifications, we have the body of the Q-function: 164*6f5dc8baSWill Pazner```julia 165*6f5dc8baSWill Pazner@inbounds @simd for i=1:Q 166*6f5dc8baSWill Pazner dXdxdXdxT = getvoigt(@view(qdata[i,:]), CeedDim(dim)) 167*6f5dc8baSWill Pazner dui = SVector{dim}(@view(du[i,:])) 168*6f5dc8baSWill Pazner dv[i,:] .= dXdxdXdxT*dui 169*6f5dc8baSWill Paznerend 170*6f5dc8baSWill Pazner``` 171*6f5dc8baSWill PaznerFirst, the matrix $w \det(J) J^{-\intercal} J^{-1}$ is stored in the variable 172*6f5dc8baSWill Pazner`dXdxdXdxT`. The symmetric entries of this matrix are accesed using 173*6f5dc8baSWill Pazner`@view(qdata[i,:])`, which avoids allocations. [`getvoigt`](@ref) is used to 174*6f5dc8baSWill Paznerconvert from Voigt notation to a symmetric matrix, which returns a statically 175*6f5dc8baSWill Paznersized `SMatrix`. The version for the correct spatial dimension is selected using 176*6f5dc8baSWill Pazner`CeedDim(dim)`, which allows for compile-time dispatch, since `dim` is a 177*6f5dc8baSWill Paznerconstant whose value is known as a constant when the Q-function is JIT compiled. 178*6f5dc8baSWill Pazner 179*6f5dc8baSWill PaznerThen, the gradient of $u$ at the given quadrature point is loaded as a 180*6f5dc8baSWill Paznerfixed-size `SVector`. The result is placed into the output array, where the 181*6f5dc8baSWill Pazner[`StaticArrays.jl`](https://github.com/JuliaArrays/StaticArrays.jl) package 182*6f5dc8baSWill Paznerevaluates `dXdxdXdxT*dui` using an optimized matrix-vector product for small 183*6f5dc8baSWill Paznermatrices (since their sizes are known statically). 184*6f5dc8baSWill Pazner 185*6f5dc8baSWill Pazner## GPU Kernels 186*6f5dc8baSWill Pazner 187*6f5dc8baSWill PaznerIf the `Ceed` resource uses a CUDA backend, then the user Q-functions defined 188*6f5dc8baSWill Paznerusing [`@interior_qf`](@ref) are automatically compiled as CUDA kernels using 189*6f5dc8baSWill Pazner[`CUDA.jl`](https://github.com/JuliaGPU/CUDA.jl). Some Julia features are not 190*6f5dc8baSWill Pazneravailable in GPU code (for example, dynamic dispatch), so if the Q-function is 191*6f5dc8baSWill Paznerintended to be run on the GPU, the user should take care when defining the body 192*6f5dc8baSWill Paznerof the user Q-function. 193