xref: /libCEED/julia/LibCEED.jl/docs/src/UserQFunctions.md (revision 6f5dc8ba493eb609f13fe336f2a5e06988b93c70)
1*6f5dc8baSWill Pazner# Defining User Q-Functions
2*6f5dc8baSWill Pazner
3*6f5dc8baSWill PaznerAn important feature of LibCEED.jl is the ability to define [user
4*6f5dc8baSWill PaznerQ-functions](https://libceed.readthedocs.io/en/latest/libCEEDapi/#gallery-of-qfunctions)
5*6f5dc8baSWill Paznernatively in Julia. These user Q-functions work with both the CPU and CUDA
6*6f5dc8baSWill Paznerbackends.
7*6f5dc8baSWill Pazner
8*6f5dc8baSWill PaznerUser Q-functions describe the action of the $D$ operator at quadrature points
9*6f5dc8baSWill Pazner(see [libCEED's theoretical
10*6f5dc8baSWill Paznerframework](https://libceed.readthedocs.io/en/latest/libCEEDapi/#theoretical-framework)).
11*6f5dc8baSWill PaznerSince the Q-functions are invoked at every quadrature point, efficiency is
12*6f5dc8baSWill Paznervery important.
13*6f5dc8baSWill Pazner
14*6f5dc8baSWill Pazner## Apply mass Q-function in C
15*6f5dc8baSWill Pazner
16*6f5dc8baSWill PaznerBefore describing how to define user Q-functions in Julia, we will briefly given
17*6f5dc8baSWill Pazneran example of a user Q-function defined in C. This is the "apply mass"
18*6f5dc8baSWill PaznerQ-function from `ex1-volume.c`, which computes the action of the mass operator.
19*6f5dc8baSWill PaznerThe mass operator on each element can be written as $B^\intercal D B$, where $B$
20*6f5dc8baSWill Pazneris the basis operator, and $D$ represents multiplication by quadrature weights
21*6f5dc8baSWill Paznerand geometric factors (i.e. the determinant of the mesh transformation Jacobian
22*6f5dc8baSWill Paznerat each qudarture point). It is the action of $D$ that the Q-function must
23*6f5dc8baSWill Paznerimplement. The C source of the Q-function is:
24*6f5dc8baSWill Pazner
25*6f5dc8baSWill Pazner```c
26*6f5dc8baSWill Pazner/// libCEED Q-function for applying a mass operator
27*6f5dc8baSWill PaznerCEED_QFUNCTION(f_apply_mass)(void *ctx, const CeedInt Q,
28*6f5dc8baSWill Pazner                             const CeedScalar *const *in,
29*6f5dc8baSWill Pazner                             CeedScalar *const *out) {
30*6f5dc8baSWill Pazner  const CeedScalar *u = in[0], *qdata = in[1];
31*6f5dc8baSWill Pazner  CeedScalar *v = out[0];
32*6f5dc8baSWill Pazner  // Quadrature Point Loop
33*6f5dc8baSWill Pazner  CeedPragmaSIMD
34*6f5dc8baSWill Pazner  for (CeedInt i=0; i<Q; i++) {
35*6f5dc8baSWill Pazner    v[i] = qdata[i] * u[i];
36*6f5dc8baSWill Pazner  } // End of Quadrature Point Loop
37*6f5dc8baSWill Pazner  return 0;
38*6f5dc8baSWill Pazner}
39*6f5dc8baSWill Pazner```
40*6f5dc8baSWill Pazner
41*6f5dc8baSWill PaznerFrom this example, we see that a user Q-function is a C callback that takes a
42*6f5dc8baSWill Pazner"data context" pointer, a number of quadrature points, and two arrays of arrays,
43*6f5dc8baSWill Paznerone for inputs, and one for outputs.
44*6f5dc8baSWill Pazner
45*6f5dc8baSWill PaznerIn this example, the first input array is `u`, which is the value of the trial
46*6f5dc8baSWill Paznerfunction evaluated at each quadrature point. The second input array is `qdata`,
47*6f5dc8baSWill Paznerwhich contains the precomputed geometric factors. There is only one output
48*6f5dc8baSWill Paznerarray, `v`, which will store the pointwise product of `u` and `data`. Given the
49*6f5dc8baSWill Paznerdefinition of this Q-function, the `CeedQFunction` object is created by
50*6f5dc8baSWill Pazner```c
51*6f5dc8baSWill PaznerCeedQFunctionCreateInterior(ceed, 1, f_apply_mass, f_apply_mass_loc, &apply_qfunc);
52*6f5dc8baSWill PaznerCeedQFunctionAddInput(apply_qfunc, "u", 1, CEED_EVAL_INTERP);
53*6f5dc8baSWill PaznerCeedQFunctionAddInput(apply_qfunc, "qdata", 1, CEED_EVAL_NONE);
54*6f5dc8baSWill PaznerCeedQFunctionAddOutput(apply_qfunc, "v", 1, CEED_EVAL_INTERP);
55*6f5dc8baSWill Pazner```
56*6f5dc8baSWill PaznerWhen adding the inputs and outputs, `CEED_EVAL_INTERP` indicates that the $B$
57*6f5dc8baSWill Paznerbasis operator should be used to interpolate the trial and test functions from
58*6f5dc8baSWill Paznernodal points to quadrature points, and `CEED_EVAL_NONE` indicates that the
59*6f5dc8baSWill Pazner`qdata` is already precomputed at quadrature points, and no interpolation is
60*6f5dc8baSWill Paznerrequried.
61*6f5dc8baSWill Pazner
62*6f5dc8baSWill Pazner## Apply mass Q-function in Julia
63*6f5dc8baSWill Pazner
64*6f5dc8baSWill PaznerWe now replicate this Q-function in Julia. The main way of defining user
65*6f5dc8baSWill PaznerQ-functions in Julia is using the [`@interior_qf`](@ref) macro. The above C code
66*6f5dc8baSWill Pazner(both the definition of the Q-function, its creation, and adding the inputs and
67*6f5dc8baSWill Pazneroutputs) is analogous to the following Julia code:
68*6f5dc8baSWill Pazner
69*6f5dc8baSWill Pazner```julia
70*6f5dc8baSWill Pazner@interior_qf apply_qfunc = (
71*6f5dc8baSWill Pazner    ceed, Q,
72*6f5dc8baSWill Pazner    (u, :in, EVAL_INTERP, Q), (qdata, :in, EVAL_NONE, Q),
73*6f5dc8baSWill Pazner    (v, :out, EVAL_INTERP, Q),
74*6f5dc8baSWill Pazner    @inbounds @simd for i=1:Q
75*6f5dc8baSWill Pazner        v[i] = qdata[i]*u[i]
76*6f5dc8baSWill Pazner    end
77*6f5dc8baSWill Pazner)
78*6f5dc8baSWill Pazner```
79*6f5dc8baSWill Pazner
80*6f5dc8baSWill PaznerThis creates a [`QFunction`](@ref) object named `apply_qfunc`. The Q-function is
81*6f5dc8baSWill Paznerdefined by the tuple on the right-hand side. `ceed` is the name of the
82*6f5dc8baSWill Pazner[`Ceed`](@ref) object where the Q-function will be created, and the second
83*6f5dc8baSWill Paznerargument, `Q`, is the name of that variable that will contain the number of
84*6f5dc8baSWill Paznerquadrature points. The next three arguments are specifications of the input and
85*6f5dc8baSWill Pazneroutput fields:
86*6f5dc8baSWill Pazner```julia
87*6f5dc8baSWill Pazner    (u, :in, EVAL_INTERP, Q),
88*6f5dc8baSWill Pazner    (qdata, :in, EVAL_NONE, Q),
89*6f5dc8baSWill Pazner    (v, :out, EVAL_INTERP, Q),
90*6f5dc8baSWill Pazner```
91*6f5dc8baSWill PaznerEach input or output field specification is a tuple, where the first entry is
92*6f5dc8baSWill Paznerthe name of the array, and the second entry is either `:in` or `:out`, according
93*6f5dc8baSWill Paznerto whether the array is an input or output array. The third entry is the
94*6f5dc8baSWill Pazner[`EvalMode`](@ref) of the field. The remaining entries are the dimensions of the
95*6f5dc8baSWill Paznerarray. The first dimension is always equal to the number of quadrature points.
96*6f5dc8baSWill PaznerIn this case, all the arrays are simply vectors whose size is equal to the
97*6f5dc8baSWill Paznernumber of quadrature points, but in more sophisticated examples (e.g. the [apply
98*6f5dc8baSWill Paznerdiffusion Q-function](@ref applydiff)) these arrays could consists of vectors or
99*6f5dc8baSWill Paznermatrices at each quadrature point. After providing all of the array
100*6f5dc8baSWill Paznerspecifications, the body of the Q-function is provided.
101*6f5dc8baSWill Pazner
102*6f5dc8baSWill Pazner## [Apply diffusion Q-function in Julia](@id applydiff)
103*6f5dc8baSWill Pazner
104*6f5dc8baSWill PaznerFor a more sophisticated example of a Q-function, we consider the "apply
105*6f5dc8baSWill Paznerdiffusion" Q-function, used in `ex2-surface`. This Q-function computes the
106*6f5dc8baSWill Pazneraction of the diffusion operator. When written in the form $B^\intercal D B$, in
107*6f5dc8baSWill Paznerthis case $B$ represents the basis gradient matrix, and $D$ represents
108*6f5dc8baSWill Paznermultiplication by $w \det(J) J^{-\intercal} J^{-1}$, where $J$ is the mesh
109*6f5dc8baSWill Paznertransformation Jacobian, and $w$ is the quadrature weight.
110*6f5dc8baSWill Pazner
111*6f5dc8baSWill PaznerThis Q-function is implemented in Julia as follows:
112*6f5dc8baSWill Pazner```julia
113*6f5dc8baSWill Pazner@interior_qf apply_qfunc = (
114*6f5dc8baSWill Pazner    ceed, Q, dim=dim,
115*6f5dc8baSWill Pazner    (du, :in, EVAL_GRAD, Q, dim),
116*6f5dc8baSWill Pazner    (qdata, :in, EVAL_NONE, Q, dim*(dim+1)÷2),
117*6f5dc8baSWill Pazner    (dv, :out, EVAL_GRAD, Q, dim),
118*6f5dc8baSWill Pazner    @inbounds @simd for i=1:Q
119*6f5dc8baSWill Pazner        dXdxdXdxT = getvoigt(@view(qdata[i,:]), CeedDim(dim))
120*6f5dc8baSWill Pazner        dui = SVector{dim}(@view(du[i,:]))
121*6f5dc8baSWill Pazner        dv[i,:] .= dXdxdXdxT*dui
122*6f5dc8baSWill Pazner    end
123*6f5dc8baSWill Pazner)
124*6f5dc8baSWill Pazner```
125*6f5dc8baSWill PaznerIn contrast to the previous example, before the field specifications, this
126*6f5dc8baSWill PaznerQ-function includes a _constant definition_ `dim=dim`. The
127*6f5dc8baSWill Pazner[`@interior_qf`](@ref) macro allows for any number of constant definitions,
128*6f5dc8baSWill Paznerwhich make the specified values available within the body of the Q-function as
129*6f5dc8baSWill Paznercompile-time constants.
130*6f5dc8baSWill Pazner
131*6f5dc8baSWill PaznerIn this example, `dim` is either 1, 2, or 3 according to the spatial dimension
132*6f5dc8baSWill Paznerof the problem. When the user Q-function is defined, LibCEED.jl will JIT compile
133*6f5dc8baSWill Paznerthe body of the Q-function and make it available to libCEED as a C callback. In
134*6f5dc8baSWill Paznerthe body of this Q-function, `dim` will be available, and its value will be a
135*6f5dc8baSWill Paznercompile-time constant, allowing for (static) dispatch based on the value of
136*6f5dc8baSWill Pazner`dim`, and eliminating branching.
137*6f5dc8baSWill Pazner
138*6f5dc8baSWill PaznerNote that `dim` is also available for use in the field specifications. In this
139*6f5dc8baSWill Paznerexample, the field specifications are slightly more involved that in the
140*6f5dc8baSWill Paznerprevious example. The arrays are given by
141*6f5dc8baSWill Pazner```julia
142*6f5dc8baSWill Pazner    (du, :in, EVAL_GRAD, Q, dim),
143*6f5dc8baSWill Pazner    (qdata, :in, EVAL_NONE, Q, dim*(dim+1)÷2),
144*6f5dc8baSWill Pazner    (dv, :out, EVAL_GRAD, Q, dim),
145*6f5dc8baSWill Pazner```
146*6f5dc8baSWill PaznerNote that the input array `du` has [`EvalMode`](@ref) `EVAL_GRAD`, meaning that
147*6f5dc8baSWill Paznerthis array stores the gradient of the trial function at each quadrature point.
148*6f5dc8baSWill PaznerTherefore, at each quadrature point, `du` stores a vector of length `dim`, and
149*6f5dc8baSWill Paznerso the shape of `du` is `(Q, dim)`. Similarly, the action of $D$ is given by
150*6f5dc8baSWill Pazner$w \det(J) J^{-\intercal} J^{-1} \nabla u$, which is also a vector of length `dim` at
151*6f5dc8baSWill Paznereach quadrature point. This means that the output array `dv` also has shape
152*6f5dc8baSWill Pazner`(Q, dim)`.
153*6f5dc8baSWill Pazner
154*6f5dc8baSWill PaznerThe geometric factors stored in `qdata` represent the symmetric matrix $w
155*6f5dc8baSWill Pazner\det(J) J^{-\intercal} J^{-1}$ evaluated at every quadrature point. In order to
156*6f5dc8baSWill Paznerreduce data usage, instead of storing this data as a $d \times d$ matrix, we use
157*6f5dc8baSWill Paznerthe fact that we know it is symmetric to only store $d(d+1)/2$ entries, and the
158*6f5dc8baSWill Paznerremaining entries we infer by symmetry. These entries are stored using the
159*6f5dc8baSWill Pazner[Voigt convention](https://en.wikipedia.org/wiki/Voigt_notation). LibCEED.jl
160*6f5dc8baSWill Paznerprovides some [utilities](Misc.md#LibCEED.getvoigt) for storing and extracting
161*6f5dc8baSWill Paznersymmetric matrices stored in this fashion.
162*6f5dc8baSWill Pazner
163*6f5dc8baSWill PaznerAfter the field specifications, we have the body of the Q-function:
164*6f5dc8baSWill Pazner```julia
165*6f5dc8baSWill Pazner@inbounds @simd for i=1:Q
166*6f5dc8baSWill Pazner    dXdxdXdxT = getvoigt(@view(qdata[i,:]), CeedDim(dim))
167*6f5dc8baSWill Pazner    dui = SVector{dim}(@view(du[i,:]))
168*6f5dc8baSWill Pazner    dv[i,:] .= dXdxdXdxT*dui
169*6f5dc8baSWill Paznerend
170*6f5dc8baSWill Pazner```
171*6f5dc8baSWill PaznerFirst, the matrix $w \det(J) J^{-\intercal} J^{-1}$ is stored in the variable
172*6f5dc8baSWill Pazner`dXdxdXdxT`. The symmetric entries of this matrix are accesed using
173*6f5dc8baSWill Pazner`@view(qdata[i,:])`, which avoids allocations. [`getvoigt`](@ref) is used to
174*6f5dc8baSWill Paznerconvert from Voigt notation to a symmetric matrix, which returns a statically
175*6f5dc8baSWill Paznersized `SMatrix`. The version for the correct spatial dimension is selected using
176*6f5dc8baSWill Pazner`CeedDim(dim)`, which allows for compile-time dispatch, since `dim` is a
177*6f5dc8baSWill Paznerconstant whose value is known as a constant when the Q-function is JIT compiled.
178*6f5dc8baSWill Pazner
179*6f5dc8baSWill PaznerThen, the gradient of $u$ at the given quadrature point is loaded as a
180*6f5dc8baSWill Paznerfixed-size `SVector`. The result is placed into the output array, where the
181*6f5dc8baSWill Pazner[`StaticArrays.jl`](https://github.com/JuliaArrays/StaticArrays.jl) package
182*6f5dc8baSWill Paznerevaluates `dXdxdXdxT*dui` using an optimized matrix-vector product for small
183*6f5dc8baSWill Paznermatrices (since their sizes are known statically).
184*6f5dc8baSWill Pazner
185*6f5dc8baSWill Pazner## GPU Kernels
186*6f5dc8baSWill Pazner
187*6f5dc8baSWill PaznerIf the `Ceed` resource uses a CUDA backend, then the user Q-functions defined
188*6f5dc8baSWill Paznerusing [`@interior_qf`](@ref) are automatically compiled as CUDA kernels using
189*6f5dc8baSWill Pazner[`CUDA.jl`](https://github.com/JuliaGPU/CUDA.jl). Some Julia features are not
190*6f5dc8baSWill Pazneravailable in GPU code (for example, dynamic dispatch), so if the Q-function is
191*6f5dc8baSWill Paznerintended to be run on the GPU, the user should take care when defining the body
192*6f5dc8baSWill Paznerof the user Q-function.
193