docs/src/UserQFunctions.md

6f5dc8baSWill Pazner# Defining User Q-Functions
6f5dc8baSWill Pazner
6f5dc8baSWill PaznerAn important feature of LibCEED.jl is the ability to define [user
*13964f07SJed BrownQ-functions](https://libceed.org/en/latest/libCEEDapi/#gallery-of-qfunctions)
6f5dc8baSWill Paznernatively in Julia. These user Q-functions work with both the CPU and CUDA
6f5dc8baSWill Paznerbackends.
6f5dc8baSWill Pazner
6f5dc8baSWill PaznerUser Q-functions describe the action of the $D$ operator at quadrature points
6f5dc8baSWill Pazner(see [libCEED's theoretical
*13964f07SJed Brownframework](https://libceed.org/en/latest/libCEEDapi/#theoretical-framework)).
6f5dc8baSWill PaznerSince the Q-functions are invoked at every quadrature point, efficiency is
6f5dc8baSWill Paznervery important.
6f5dc8baSWill Pazner
6f5dc8baSWill Pazner## Apply mass Q-function in C
6f5dc8baSWill Pazner
6f5dc8baSWill PaznerBefore describing how to define user Q-functions in Julia, we will briefly given
6f5dc8baSWill Pazneran example of a user Q-function defined in C. This is the "apply mass"
6f5dc8baSWill PaznerQ-function from `ex1-volume.c`, which computes the action of the mass operator.
6f5dc8baSWill PaznerThe mass operator on each element can be written as $B^\intercal D B$, where $B$
6f5dc8baSWill Pazneris the basis operator, and $D$ represents multiplication by quadrature weights
6f5dc8baSWill Paznerand geometric factors (i.e. the determinant of the mesh transformation Jacobian
6f5dc8baSWill Paznerat each qudarture point). It is the action of $D$ that the Q-function must
6f5dc8baSWill Paznerimplement. The C source of the Q-function is:
6f5dc8baSWill Pazner
6f5dc8baSWill Pazner```c
6f5dc8baSWill Pazner/// libCEED Q-function for applying a mass operator
6f5dc8baSWill PaznerCEED_QFUNCTION(f_apply_mass)(void *ctx, const CeedInt Q,
6f5dc8baSWill Pazner                             const CeedScalar *const *in,
6f5dc8baSWill Pazner                             CeedScalar *const *out) {
6f5dc8baSWill Pazner  const CeedScalar *u = in[0], *qdata = in[1];
6f5dc8baSWill Pazner  CeedScalar *v = out[0];
6f5dc8baSWill Pazner  // Quadrature Point Loop
6f5dc8baSWill Pazner  CeedPragmaSIMD
6f5dc8baSWill Pazner  for (CeedInt i=0; i<Q; i++) {
6f5dc8baSWill Pazner    v[i] = qdata[i] * u[i];
6f5dc8baSWill Pazner  } // End of Quadrature Point Loop
6f5dc8baSWill Pazner  return 0;
6f5dc8baSWill Pazner}
6f5dc8baSWill Pazner```
6f5dc8baSWill Pazner
6f5dc8baSWill PaznerFrom this example, we see that a user Q-function is a C callback that takes a
6f5dc8baSWill Pazner"data context" pointer, a number of quadrature points, and two arrays of arrays,
6f5dc8baSWill Paznerone for inputs, and one for outputs.
6f5dc8baSWill Pazner
6f5dc8baSWill PaznerIn this example, the first input array is `u`, which is the value of the trial
6f5dc8baSWill Paznerfunction evaluated at each quadrature point. The second input array is `qdata`,
6f5dc8baSWill Paznerwhich contains the precomputed geometric factors. There is only one output
6f5dc8baSWill Paznerarray, `v`, which will store the pointwise product of `u` and `data`. Given the
6f5dc8baSWill Paznerdefinition of this Q-function, the `CeedQFunction` object is created by
6f5dc8baSWill Pazner```c
6f5dc8baSWill PaznerCeedQFunctionCreateInterior(ceed, 1, f_apply_mass, f_apply_mass_loc, &apply_qfunc);
6f5dc8baSWill PaznerCeedQFunctionAddInput(apply_qfunc, "u", 1, CEED_EVAL_INTERP);
6f5dc8baSWill PaznerCeedQFunctionAddInput(apply_qfunc, "qdata", 1, CEED_EVAL_NONE);
6f5dc8baSWill PaznerCeedQFunctionAddOutput(apply_qfunc, "v", 1, CEED_EVAL_INTERP);
6f5dc8baSWill Pazner```
6f5dc8baSWill PaznerWhen adding the inputs and outputs, `CEED_EVAL_INTERP` indicates that the $B$
6f5dc8baSWill Paznerbasis operator should be used to interpolate the trial and test functions from
6f5dc8baSWill Paznernodal points to quadrature points, and `CEED_EVAL_NONE` indicates that the
6f5dc8baSWill Pazner`qdata` is already precomputed at quadrature points, and no interpolation is
6f5dc8baSWill Paznerrequried.
6f5dc8baSWill Pazner
6f5dc8baSWill Pazner## Apply mass Q-function in Julia
6f5dc8baSWill Pazner
6f5dc8baSWill PaznerWe now replicate this Q-function in Julia. The main way of defining user
6f5dc8baSWill PaznerQ-functions in Julia is using the [`@interior_qf`](@ref) macro. The above C code
6f5dc8baSWill Pazner(both the definition of the Q-function, its creation, and adding the inputs and
6f5dc8baSWill Pazneroutputs) is analogous to the following Julia code:
6f5dc8baSWill Pazner
6f5dc8baSWill Pazner```julia
6f5dc8baSWill Pazner@interior_qf apply_qfunc = (
6f5dc8baSWill Pazner    ceed, Q,
6f5dc8baSWill Pazner    (u, :in, EVAL_INTERP, Q), (qdata, :in, EVAL_NONE, Q),
6f5dc8baSWill Pazner    (v, :out, EVAL_INTERP, Q),
6f5dc8baSWill Pazner    @inbounds @simd for i=1:Q
6f5dc8baSWill Pazner        v[i] = qdata[i]*u[i]
6f5dc8baSWill Pazner    end
6f5dc8baSWill Pazner)
6f5dc8baSWill Pazner```
6f5dc8baSWill Pazner
6f5dc8baSWill PaznerThis creates a [`QFunction`](@ref) object named `apply_qfunc`. The Q-function is
6f5dc8baSWill Paznerdefined by the tuple on the right-hand side. `ceed` is the name of the
6f5dc8baSWill Pazner[`Ceed`](@ref) object where the Q-function will be created, and the second
6f5dc8baSWill Paznerargument, `Q`, is the name of that variable that will contain the number of
6f5dc8baSWill Paznerquadrature points. The next three arguments are specifications of the input and
6f5dc8baSWill Pazneroutput fields:
6f5dc8baSWill Pazner```julia
6f5dc8baSWill Pazner    (u, :in, EVAL_INTERP, Q),
6f5dc8baSWill Pazner    (qdata, :in, EVAL_NONE, Q),
6f5dc8baSWill Pazner    (v, :out, EVAL_INTERP, Q),
6f5dc8baSWill Pazner```
6f5dc8baSWill PaznerEach input or output field specification is a tuple, where the first entry is
6f5dc8baSWill Paznerthe name of the array, and the second entry is either `:in` or `:out`, according
6f5dc8baSWill Paznerto whether the array is an input or output array. The third entry is the
6f5dc8baSWill Pazner[`EvalMode`](@ref) of the field. The remaining entries are the dimensions of the
6f5dc8baSWill Paznerarray. The first dimension is always equal to the number of quadrature points.
6f5dc8baSWill PaznerIn this case, all the arrays are simply vectors whose size is equal to the
6f5dc8baSWill Paznernumber of quadrature points, but in more sophisticated examples (e.g. the [apply
6f5dc8baSWill Paznerdiffusion Q-function](@ref applydiff)) these arrays could consists of vectors or
6f5dc8baSWill Paznermatrices at each quadrature point. After providing all of the array
6f5dc8baSWill Paznerspecifications, the body of the Q-function is provided.
6f5dc8baSWill Pazner
6f5dc8baSWill Pazner## [Apply diffusion Q-function in Julia](@id applydiff)
6f5dc8baSWill Pazner
6f5dc8baSWill PaznerFor a more sophisticated example of a Q-function, we consider the "apply
6f5dc8baSWill Paznerdiffusion" Q-function, used in `ex2-surface`. This Q-function computes the
6f5dc8baSWill Pazneraction of the diffusion operator. When written in the form $B^\intercal D B$, in
6f5dc8baSWill Paznerthis case $B$ represents the basis gradient matrix, and $D$ represents
6f5dc8baSWill Paznermultiplication by $w \det(J) J^{-\intercal} J^{-1}$, where $J$ is the mesh
6f5dc8baSWill Paznertransformation Jacobian, and $w$ is the quadrature weight.
6f5dc8baSWill Pazner
6f5dc8baSWill PaznerThis Q-function is implemented in Julia as follows:
6f5dc8baSWill Pazner```julia
6f5dc8baSWill Pazner@interior_qf apply_qfunc = (
6f5dc8baSWill Pazner    ceed, Q, dim=dim,
6f5dc8baSWill Pazner    (du, :in, EVAL_GRAD, Q, dim),
6f5dc8baSWill Pazner    (qdata, :in, EVAL_NONE, Q, dim*(dim+1)÷2),
6f5dc8baSWill Pazner    (dv, :out, EVAL_GRAD, Q, dim),
6f5dc8baSWill Pazner    @inbounds @simd for i=1:Q
6f5dc8baSWill Pazner        dXdxdXdxT = getvoigt(@view(qdata[i,:]), CeedDim(dim))
6f5dc8baSWill Pazner        dui = SVector{dim}(@view(du[i,:]))
6f5dc8baSWill Pazner        dv[i,:] .= dXdxdXdxT*dui
6f5dc8baSWill Pazner    end
6f5dc8baSWill Pazner)
6f5dc8baSWill Pazner```
6f5dc8baSWill PaznerIn contrast to the previous example, before the field specifications, this
6f5dc8baSWill PaznerQ-function includes a _constant definition_ `dim=dim`. The
6f5dc8baSWill Pazner[`@interior_qf`](@ref) macro allows for any number of constant definitions,
6f5dc8baSWill Paznerwhich make the specified values available within the body of the Q-function as
6f5dc8baSWill Paznercompile-time constants.
6f5dc8baSWill Pazner
6f5dc8baSWill PaznerIn this example, `dim` is either 1, 2, or 3 according to the spatial dimension
6f5dc8baSWill Paznerof the problem. When the user Q-function is defined, LibCEED.jl will JIT compile
6f5dc8baSWill Paznerthe body of the Q-function and make it available to libCEED as a C callback. In
6f5dc8baSWill Paznerthe body of this Q-function, `dim` will be available, and its value will be a
6f5dc8baSWill Paznercompile-time constant, allowing for (static) dispatch based on the value of
6f5dc8baSWill Pazner`dim`, and eliminating branching.
6f5dc8baSWill Pazner
6f5dc8baSWill PaznerNote that `dim` is also available for use in the field specifications. In this
6f5dc8baSWill Paznerexample, the field specifications are slightly more involved that in the
6f5dc8baSWill Paznerprevious example. The arrays are given by
6f5dc8baSWill Pazner```julia
6f5dc8baSWill Pazner    (du, :in, EVAL_GRAD, Q, dim),
6f5dc8baSWill Pazner    (qdata, :in, EVAL_NONE, Q, dim*(dim+1)÷2),
6f5dc8baSWill Pazner    (dv, :out, EVAL_GRAD, Q, dim),
6f5dc8baSWill Pazner```
6f5dc8baSWill PaznerNote that the input array `du` has [`EvalMode`](@ref) `EVAL_GRAD`, meaning that
6f5dc8baSWill Paznerthis array stores the gradient of the trial function at each quadrature point.
6f5dc8baSWill PaznerTherefore, at each quadrature point, `du` stores a vector of length `dim`, and
6f5dc8baSWill Paznerso the shape of `du` is `(Q, dim)`. Similarly, the action of $D$ is given by
6f5dc8baSWill Pazner$w \det(J) J^{-\intercal} J^{-1} \nabla u$, which is also a vector of length `dim` at
6f5dc8baSWill Paznereach quadrature point. This means that the output array `dv` also has shape
6f5dc8baSWill Pazner`(Q, dim)`.
6f5dc8baSWill Pazner
6f5dc8baSWill PaznerThe geometric factors stored in `qdata` represent the symmetric matrix $w
6f5dc8baSWill Pazner\det(J) J^{-\intercal} J^{-1}$ evaluated at every quadrature point. In order to
6f5dc8baSWill Paznerreduce data usage, instead of storing this data as a $d \times d$ matrix, we use
6f5dc8baSWill Paznerthe fact that we know it is symmetric to only store $d(d+1)/2$ entries, and the
6f5dc8baSWill Paznerremaining entries we infer by symmetry. These entries are stored using the
6f5dc8baSWill Pazner[Voigt convention](https://en.wikipedia.org/wiki/Voigt_notation). LibCEED.jl
6f5dc8baSWill Paznerprovides some [utilities](Misc.md#LibCEED.getvoigt) for storing and extracting
6f5dc8baSWill Paznersymmetric matrices stored in this fashion.
6f5dc8baSWill Pazner
6f5dc8baSWill PaznerAfter the field specifications, we have the body of the Q-function:
6f5dc8baSWill Pazner```julia
6f5dc8baSWill Pazner@inbounds @simd for i=1:Q
6f5dc8baSWill Pazner    dXdxdXdxT = getvoigt(@view(qdata[i,:]), CeedDim(dim))
6f5dc8baSWill Pazner    dui = SVector{dim}(@view(du[i,:]))
6f5dc8baSWill Pazner    dv[i,:] .= dXdxdXdxT*dui
6f5dc8baSWill Paznerend
6f5dc8baSWill Pazner```
6f5dc8baSWill PaznerFirst, the matrix $w \det(J) J^{-\intercal} J^{-1}$ is stored in the variable
6f5dc8baSWill Pazner`dXdxdXdxT`. The symmetric entries of this matrix are accesed using
6f5dc8baSWill Pazner`@view(qdata[i,:])`, which avoids allocations. [`getvoigt`](@ref) is used to
6f5dc8baSWill Paznerconvert from Voigt notation to a symmetric matrix, which returns a statically
6f5dc8baSWill Paznersized `SMatrix`. The version for the correct spatial dimension is selected using
6f5dc8baSWill Pazner`CeedDim(dim)`, which allows for compile-time dispatch, since `dim` is a
6f5dc8baSWill Paznerconstant whose value is known as a constant when the Q-function is JIT compiled.
6f5dc8baSWill Pazner
6f5dc8baSWill PaznerThen, the gradient of $u$ at the given quadrature point is loaded as a
6f5dc8baSWill Paznerfixed-size `SVector`. The result is placed into the output array, where the
6f5dc8baSWill Pazner[`StaticArrays.jl`](https://github.com/JuliaArrays/StaticArrays.jl) package
6f5dc8baSWill Paznerevaluates `dXdxdXdxT*dui` using an optimized matrix-vector product for small
6f5dc8baSWill Paznermatrices (since their sizes are known statically).
6f5dc8baSWill Pazner
6f5dc8baSWill Pazner## GPU Kernels
6f5dc8baSWill Pazner
6f5dc8baSWill PaznerIf the `Ceed` resource uses a CUDA backend, then the user Q-functions defined
6f5dc8baSWill Paznerusing [`@interior_qf`](@ref) are automatically compiled as CUDA kernels using
6f5dc8baSWill Pazner[`CUDA.jl`](https://github.com/JuliaGPU/CUDA.jl). Some Julia features are not
6f5dc8baSWill Pazneravailable in GPU code (for example, dynamic dispatch), so if the Q-function is
6f5dc8baSWill Paznerintended to be run on the GPU, the user should take care when defining the body
6f5dc8baSWill Paznerof the user Q-function.