16f5dc8baSWill Pazner# Defining User Q-Functions 26f5dc8baSWill Pazner 36f5dc8baSWill PaznerAn important feature of LibCEED.jl is the ability to define [user 4*13964f07SJed BrownQ-functions](https://libceed.org/en/latest/libCEEDapi/#gallery-of-qfunctions) 56f5dc8baSWill Paznernatively in Julia. These user Q-functions work with both the CPU and CUDA 66f5dc8baSWill Paznerbackends. 76f5dc8baSWill Pazner 86f5dc8baSWill PaznerUser Q-functions describe the action of the $D$ operator at quadrature points 96f5dc8baSWill Pazner(see [libCEED's theoretical 10*13964f07SJed Brownframework](https://libceed.org/en/latest/libCEEDapi/#theoretical-framework)). 116f5dc8baSWill PaznerSince the Q-functions are invoked at every quadrature point, efficiency is 126f5dc8baSWill Paznervery important. 136f5dc8baSWill Pazner 146f5dc8baSWill Pazner## Apply mass Q-function in C 156f5dc8baSWill Pazner 166f5dc8baSWill PaznerBefore describing how to define user Q-functions in Julia, we will briefly given 176f5dc8baSWill Pazneran example of a user Q-function defined in C. This is the "apply mass" 186f5dc8baSWill PaznerQ-function from `ex1-volume.c`, which computes the action of the mass operator. 196f5dc8baSWill PaznerThe mass operator on each element can be written as $B^\intercal D B$, where $B$ 206f5dc8baSWill Pazneris the basis operator, and $D$ represents multiplication by quadrature weights 216f5dc8baSWill Paznerand geometric factors (i.e. the determinant of the mesh transformation Jacobian 226f5dc8baSWill Paznerat each qudarture point). It is the action of $D$ that the Q-function must 236f5dc8baSWill Paznerimplement. The C source of the Q-function is: 246f5dc8baSWill Pazner 256f5dc8baSWill Pazner```c 266f5dc8baSWill Pazner/// libCEED Q-function for applying a mass operator 276f5dc8baSWill PaznerCEED_QFUNCTION(f_apply_mass)(void *ctx, const CeedInt Q, 286f5dc8baSWill Pazner const CeedScalar *const *in, 296f5dc8baSWill Pazner CeedScalar *const *out) { 306f5dc8baSWill Pazner const CeedScalar *u = in[0], *qdata = in[1]; 316f5dc8baSWill Pazner CeedScalar *v = out[0]; 326f5dc8baSWill Pazner // Quadrature Point Loop 336f5dc8baSWill Pazner CeedPragmaSIMD 346f5dc8baSWill Pazner for (CeedInt i=0; i<Q; i++) { 356f5dc8baSWill Pazner v[i] = qdata[i] * u[i]; 366f5dc8baSWill Pazner } // End of Quadrature Point Loop 376f5dc8baSWill Pazner return 0; 386f5dc8baSWill Pazner} 396f5dc8baSWill Pazner``` 406f5dc8baSWill Pazner 416f5dc8baSWill PaznerFrom this example, we see that a user Q-function is a C callback that takes a 426f5dc8baSWill Pazner"data context" pointer, a number of quadrature points, and two arrays of arrays, 436f5dc8baSWill Paznerone for inputs, and one for outputs. 446f5dc8baSWill Pazner 456f5dc8baSWill PaznerIn this example, the first input array is `u`, which is the value of the trial 466f5dc8baSWill Paznerfunction evaluated at each quadrature point. The second input array is `qdata`, 476f5dc8baSWill Paznerwhich contains the precomputed geometric factors. There is only one output 486f5dc8baSWill Paznerarray, `v`, which will store the pointwise product of `u` and `data`. Given the 496f5dc8baSWill Paznerdefinition of this Q-function, the `CeedQFunction` object is created by 506f5dc8baSWill Pazner```c 516f5dc8baSWill PaznerCeedQFunctionCreateInterior(ceed, 1, f_apply_mass, f_apply_mass_loc, &apply_qfunc); 526f5dc8baSWill PaznerCeedQFunctionAddInput(apply_qfunc, "u", 1, CEED_EVAL_INTERP); 536f5dc8baSWill PaznerCeedQFunctionAddInput(apply_qfunc, "qdata", 1, CEED_EVAL_NONE); 546f5dc8baSWill PaznerCeedQFunctionAddOutput(apply_qfunc, "v", 1, CEED_EVAL_INTERP); 556f5dc8baSWill Pazner``` 566f5dc8baSWill PaznerWhen adding the inputs and outputs, `CEED_EVAL_INTERP` indicates that the $B$ 576f5dc8baSWill Paznerbasis operator should be used to interpolate the trial and test functions from 586f5dc8baSWill Paznernodal points to quadrature points, and `CEED_EVAL_NONE` indicates that the 596f5dc8baSWill Pazner`qdata` is already precomputed at quadrature points, and no interpolation is 606f5dc8baSWill Paznerrequried. 616f5dc8baSWill Pazner 626f5dc8baSWill Pazner## Apply mass Q-function in Julia 636f5dc8baSWill Pazner 646f5dc8baSWill PaznerWe now replicate this Q-function in Julia. The main way of defining user 656f5dc8baSWill PaznerQ-functions in Julia is using the [`@interior_qf`](@ref) macro. The above C code 666f5dc8baSWill Pazner(both the definition of the Q-function, its creation, and adding the inputs and 676f5dc8baSWill Pazneroutputs) is analogous to the following Julia code: 686f5dc8baSWill Pazner 696f5dc8baSWill Pazner```julia 706f5dc8baSWill Pazner@interior_qf apply_qfunc = ( 716f5dc8baSWill Pazner ceed, Q, 726f5dc8baSWill Pazner (u, :in, EVAL_INTERP, Q), (qdata, :in, EVAL_NONE, Q), 736f5dc8baSWill Pazner (v, :out, EVAL_INTERP, Q), 746f5dc8baSWill Pazner @inbounds @simd for i=1:Q 756f5dc8baSWill Pazner v[i] = qdata[i]*u[i] 766f5dc8baSWill Pazner end 776f5dc8baSWill Pazner) 786f5dc8baSWill Pazner``` 796f5dc8baSWill Pazner 806f5dc8baSWill PaznerThis creates a [`QFunction`](@ref) object named `apply_qfunc`. The Q-function is 816f5dc8baSWill Paznerdefined by the tuple on the right-hand side. `ceed` is the name of the 826f5dc8baSWill Pazner[`Ceed`](@ref) object where the Q-function will be created, and the second 836f5dc8baSWill Paznerargument, `Q`, is the name of that variable that will contain the number of 846f5dc8baSWill Paznerquadrature points. The next three arguments are specifications of the input and 856f5dc8baSWill Pazneroutput fields: 866f5dc8baSWill Pazner```julia 876f5dc8baSWill Pazner (u, :in, EVAL_INTERP, Q), 886f5dc8baSWill Pazner (qdata, :in, EVAL_NONE, Q), 896f5dc8baSWill Pazner (v, :out, EVAL_INTERP, Q), 906f5dc8baSWill Pazner``` 916f5dc8baSWill PaznerEach input or output field specification is a tuple, where the first entry is 926f5dc8baSWill Paznerthe name of the array, and the second entry is either `:in` or `:out`, according 936f5dc8baSWill Paznerto whether the array is an input or output array. The third entry is the 946f5dc8baSWill Pazner[`EvalMode`](@ref) of the field. The remaining entries are the dimensions of the 956f5dc8baSWill Paznerarray. The first dimension is always equal to the number of quadrature points. 966f5dc8baSWill PaznerIn this case, all the arrays are simply vectors whose size is equal to the 976f5dc8baSWill Paznernumber of quadrature points, but in more sophisticated examples (e.g. the [apply 986f5dc8baSWill Paznerdiffusion Q-function](@ref applydiff)) these arrays could consists of vectors or 996f5dc8baSWill Paznermatrices at each quadrature point. After providing all of the array 1006f5dc8baSWill Paznerspecifications, the body of the Q-function is provided. 1016f5dc8baSWill Pazner 1026f5dc8baSWill Pazner## [Apply diffusion Q-function in Julia](@id applydiff) 1036f5dc8baSWill Pazner 1046f5dc8baSWill PaznerFor a more sophisticated example of a Q-function, we consider the "apply 1056f5dc8baSWill Paznerdiffusion" Q-function, used in `ex2-surface`. This Q-function computes the 1066f5dc8baSWill Pazneraction of the diffusion operator. When written in the form $B^\intercal D B$, in 1076f5dc8baSWill Paznerthis case $B$ represents the basis gradient matrix, and $D$ represents 1086f5dc8baSWill Paznermultiplication by $w \det(J) J^{-\intercal} J^{-1}$, where $J$ is the mesh 1096f5dc8baSWill Paznertransformation Jacobian, and $w$ is the quadrature weight. 1106f5dc8baSWill Pazner 1116f5dc8baSWill PaznerThis Q-function is implemented in Julia as follows: 1126f5dc8baSWill Pazner```julia 1136f5dc8baSWill Pazner@interior_qf apply_qfunc = ( 1146f5dc8baSWill Pazner ceed, Q, dim=dim, 1156f5dc8baSWill Pazner (du, :in, EVAL_GRAD, Q, dim), 1166f5dc8baSWill Pazner (qdata, :in, EVAL_NONE, Q, dim*(dim+1)÷2), 1176f5dc8baSWill Pazner (dv, :out, EVAL_GRAD, Q, dim), 1186f5dc8baSWill Pazner @inbounds @simd for i=1:Q 1196f5dc8baSWill Pazner dXdxdXdxT = getvoigt(@view(qdata[i,:]), CeedDim(dim)) 1206f5dc8baSWill Pazner dui = SVector{dim}(@view(du[i,:])) 1216f5dc8baSWill Pazner dv[i,:] .= dXdxdXdxT*dui 1226f5dc8baSWill Pazner end 1236f5dc8baSWill Pazner) 1246f5dc8baSWill Pazner``` 1256f5dc8baSWill PaznerIn contrast to the previous example, before the field specifications, this 1266f5dc8baSWill PaznerQ-function includes a _constant definition_ `dim=dim`. The 1276f5dc8baSWill Pazner[`@interior_qf`](@ref) macro allows for any number of constant definitions, 1286f5dc8baSWill Paznerwhich make the specified values available within the body of the Q-function as 1296f5dc8baSWill Paznercompile-time constants. 1306f5dc8baSWill Pazner 1316f5dc8baSWill PaznerIn this example, `dim` is either 1, 2, or 3 according to the spatial dimension 1326f5dc8baSWill Paznerof the problem. When the user Q-function is defined, LibCEED.jl will JIT compile 1336f5dc8baSWill Paznerthe body of the Q-function and make it available to libCEED as a C callback. In 1346f5dc8baSWill Paznerthe body of this Q-function, `dim` will be available, and its value will be a 1356f5dc8baSWill Paznercompile-time constant, allowing for (static) dispatch based on the value of 1366f5dc8baSWill Pazner`dim`, and eliminating branching. 1376f5dc8baSWill Pazner 1386f5dc8baSWill PaznerNote that `dim` is also available for use in the field specifications. In this 1396f5dc8baSWill Paznerexample, the field specifications are slightly more involved that in the 1406f5dc8baSWill Paznerprevious example. The arrays are given by 1416f5dc8baSWill Pazner```julia 1426f5dc8baSWill Pazner (du, :in, EVAL_GRAD, Q, dim), 1436f5dc8baSWill Pazner (qdata, :in, EVAL_NONE, Q, dim*(dim+1)÷2), 1446f5dc8baSWill Pazner (dv, :out, EVAL_GRAD, Q, dim), 1456f5dc8baSWill Pazner``` 1466f5dc8baSWill PaznerNote that the input array `du` has [`EvalMode`](@ref) `EVAL_GRAD`, meaning that 1476f5dc8baSWill Paznerthis array stores the gradient of the trial function at each quadrature point. 1486f5dc8baSWill PaznerTherefore, at each quadrature point, `du` stores a vector of length `dim`, and 1496f5dc8baSWill Paznerso the shape of `du` is `(Q, dim)`. Similarly, the action of $D$ is given by 1506f5dc8baSWill Pazner$w \det(J) J^{-\intercal} J^{-1} \nabla u$, which is also a vector of length `dim` at 1516f5dc8baSWill Paznereach quadrature point. This means that the output array `dv` also has shape 1526f5dc8baSWill Pazner`(Q, dim)`. 1536f5dc8baSWill Pazner 1546f5dc8baSWill PaznerThe geometric factors stored in `qdata` represent the symmetric matrix $w 1556f5dc8baSWill Pazner\det(J) J^{-\intercal} J^{-1}$ evaluated at every quadrature point. In order to 1566f5dc8baSWill Paznerreduce data usage, instead of storing this data as a $d \times d$ matrix, we use 1576f5dc8baSWill Paznerthe fact that we know it is symmetric to only store $d(d+1)/2$ entries, and the 1586f5dc8baSWill Paznerremaining entries we infer by symmetry. These entries are stored using the 1596f5dc8baSWill Pazner[Voigt convention](https://en.wikipedia.org/wiki/Voigt_notation). LibCEED.jl 1606f5dc8baSWill Paznerprovides some [utilities](Misc.md#LibCEED.getvoigt) for storing and extracting 1616f5dc8baSWill Paznersymmetric matrices stored in this fashion. 1626f5dc8baSWill Pazner 1636f5dc8baSWill PaznerAfter the field specifications, we have the body of the Q-function: 1646f5dc8baSWill Pazner```julia 1656f5dc8baSWill Pazner@inbounds @simd for i=1:Q 1666f5dc8baSWill Pazner dXdxdXdxT = getvoigt(@view(qdata[i,:]), CeedDim(dim)) 1676f5dc8baSWill Pazner dui = SVector{dim}(@view(du[i,:])) 1686f5dc8baSWill Pazner dv[i,:] .= dXdxdXdxT*dui 1696f5dc8baSWill Paznerend 1706f5dc8baSWill Pazner``` 1716f5dc8baSWill PaznerFirst, the matrix $w \det(J) J^{-\intercal} J^{-1}$ is stored in the variable 1726f5dc8baSWill Pazner`dXdxdXdxT`. The symmetric entries of this matrix are accesed using 1736f5dc8baSWill Pazner`@view(qdata[i,:])`, which avoids allocations. [`getvoigt`](@ref) is used to 1746f5dc8baSWill Paznerconvert from Voigt notation to a symmetric matrix, which returns a statically 1756f5dc8baSWill Paznersized `SMatrix`. The version for the correct spatial dimension is selected using 1766f5dc8baSWill Pazner`CeedDim(dim)`, which allows for compile-time dispatch, since `dim` is a 1776f5dc8baSWill Paznerconstant whose value is known as a constant when the Q-function is JIT compiled. 1786f5dc8baSWill Pazner 1796f5dc8baSWill PaznerThen, the gradient of $u$ at the given quadrature point is loaded as a 1806f5dc8baSWill Paznerfixed-size `SVector`. The result is placed into the output array, where the 1816f5dc8baSWill Pazner[`StaticArrays.jl`](https://github.com/JuliaArrays/StaticArrays.jl) package 1826f5dc8baSWill Paznerevaluates `dXdxdXdxT*dui` using an optimized matrix-vector product for small 1836f5dc8baSWill Paznermatrices (since their sizes are known statically). 1846f5dc8baSWill Pazner 1856f5dc8baSWill Pazner## GPU Kernels 1866f5dc8baSWill Pazner 1876f5dc8baSWill PaznerIf the `Ceed` resource uses a CUDA backend, then the user Q-functions defined 1886f5dc8baSWill Paznerusing [`@interior_qf`](@ref) are automatically compiled as CUDA kernels using 1896f5dc8baSWill Pazner[`CUDA.jl`](https://github.com/JuliaGPU/CUDA.jl). Some Julia features are not 1906f5dc8baSWill Pazneravailable in GPU code (for example, dynamic dispatch), so if the Q-function is 1916f5dc8baSWill Paznerintended to be run on the GPU, the user should take care when defining the body 1926f5dc8baSWill Paznerof the user Q-function. 193