# libCEED: Efficient Extensible Discretization

[![GitHub Actions][github-badge]][github-link]
[![GitLab-CI][gitlab-badge]][gitlab-link]
[![Azure Pipelines][azure-badge]][azure-link]
[![Code coverage][codecov-badge]][codecov-link]
[![BSD-2-Clause][license-badge]][license-link]
[![Documentation][doc-badge]][doc-link]
[![JOSS paper][joss-badge]][joss-link]
[![Binder][binder-badge]][binder-link]

## Summary and Purpose

libCEED provides fast algebra for element-based discretizations, designed for
performance portability, run-time flexibility, and clean embedding in higher
level libraries and applications. It offers a C99 interface as well as bindings
for Fortran, Python, Julia, and Rust.
While our focus is on high-order finite elements, the approach is mostly
algebraic and thus applicable to other discretizations in factored form, as
explained in the [user manual](https://libceed.readthedocs.io/en/latest/) and
API implementation portion of the
[documentation](https://libceed.readthedocs.io/en/latest/api/).

One of the challenges with high-order methods is that a global sparse matrix is
no longer a good representation of a high-order linear operator, both with
respect to the FLOPs needed for its evaluation and the memory transfer
needed for a matvec.  Thus, high-order methods require a new "format" that still
represents a linear (or more generally non-linear) operator, but not through a
sparse matrix.
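
To make the storage argument concrete, the sketch below compares, for a single hexahedral element of polynomial degree `p`, the number of entries in a dense element matrix with the amount of data a matrix-free operator might keep instead. The function name `element_storage`, the choice of `p + 1` quadrature points per dimension, and the assumption of one stored scalar per quadrature point are illustrative assumptions, not quantities taken from libCEED.

```python
def element_storage(p):
    """Return (nodes, assembled_entries, quadrature_values) for one hex element.

    Assumptions (illustrative only): scalar field, degree-p tensor-product
    basis, p + 1 quadrature points per dimension, one stored value per
    quadrature point.
    """
    nodes = (p + 1) ** 3              # degrees of freedom on the element
    assembled_entries = nodes ** 2    # a dense element matrix couples every pair of nodes
    quadrature_values = (p + 1) ** 3  # matrix-free: data kept only at quadrature points
    return nodes, assembled_entries, quadrature_values


for p in (1, 2, 4, 8):
    nodes, assembled, qdata = element_storage(p)
    print(f"p={p}: nodes={nodes}, assembled entries={assembled}, "
          f"quadrature values={qdata}, ratio={assembled // qdata}")
```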

The goal of libCEED is to propose such a format, as well as supporting
implementations and data structures, that enable efficient operator evaluation
on a variety of computational device types (CPUs, GPUs, etc.). This new operator
description is based on algebraically
[factored form](https://libceed.readthedocs.io/en/latest/libCEEDapi/#finite-element-operator-decomposition),
which is easy to incorporate in a wide variety of applications, without significant
refactoring of their own discretization infrastructure.

The repository is part of the
[CEED software suite](http://ceed.exascaleproject.org/software/), a collection of
software benchmarks, miniapps, libraries and APIs for efficient exascale
discretizations based on high-order finite element and spectral element methods.
See <http://github.com/ceed> for more information and source code availability.

The CEED research is supported by the
[Exascale Computing Project](https://exascaleproject.org/exascale-computing-project)
(17-SC-20-SC), a collaborative effort of two U.S. Department of Energy
organizations (Office of Science and the National Nuclear Security
Administration) responsible for the planning and preparation of a
[capable exascale ecosystem](https://exascaleproject.org/what-is-exascale), including
software, applications, hardware, advanced system engineering and early testbed
platforms, in support of the nation’s exascale computing imperative.

For more details on the CEED API see the [user manual](https://libceed.readthedocs.io/en/latest/).

% gettingstarted-inclusion-marker

## Building

The CEED library, `libceed`, is a C99 library with no required dependencies, and
with Fortran, Python, Julia, and Rust interfaces.  It can be built using:

```
make
```

or, with optimization flags:

```
make OPT='-O3 -march=skylake-avx512 -ffp-contract=fast'
```

These optimization flags are used by all languages (C, C++, Fortran) and this
makefile variable can also be set for testing and examples (below).

The library attempts to automatically detect support for the AVX
instruction set using gcc-style compiler options for the host.
Support may need to be manually specified via:

```
make AVX=1
```

or:

```
make AVX=0
```

if your compiler does not support gcc-style options, if you are cross
compiling, etc.

To enable CUDA support, add `CUDA_DIR=/opt/cuda` or an appropriate directory
to your `make` invocation. To enable HIP support, add `HIP_DIR=/opt/rocm` or
an appropriate directory. To store these or other arguments as defaults for
future invocations of `make`, use:

```
make configure CUDA_DIR=/usr/local/cuda HIP_DIR=/opt/rocm OPT='-O3 -march=znver2'
```

which stores these variables in `config.mk`.

## Additional Language Interfaces

The Fortran interface is built alongside the library automatically.

Python users can install using:

```
pip install libceed
```

or in a clone of the repository via `pip install .`.

Julia users can install using:

```
$ julia
julia> ]
pkg> add LibCEED
```

in the Julia package manager or in a clone of the repository via:

```
JULIA_LIBCEED_LIB=/path/to/libceed.so julia
julia> # press ] to enter package manager
(env) pkg> build LibCEED
```

Rust users can include libCEED via `Cargo.toml`:

```toml
[dependencies]
libceed = { git = "https://github.com/CEED/libCEED", branch = "main" }
```

See the [Cargo documentation](https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#specifying-dependencies-from-git-repositories) for details.

## Testing

The test suite produces [TAP](https://testanything.org) output and is run by:

```
make test
```

or, using the `prove` tool distributed with Perl (recommended):

```
make prove
```

## Backends

There are multiple supported backends, which can be selected at runtime in the examples:

| CEED resource              | Backend                                           | Deterministic Capable |
| :---                       | :---                                              | :---:                 |
||
| **CPU Native**             |
| `/cpu/self/ref/serial`     | Serial reference implementation                   | Yes                   |
| `/cpu/self/ref/blocked`    | Blocked reference implementation                  | Yes                   |
| `/cpu/self/opt/serial`     | Serial optimized C implementation                 | Yes                   |
| `/cpu/self/opt/blocked`    | Blocked optimized C implementation                | Yes                   |
| `/cpu/self/avx/serial`     | Serial AVX implementation                         | Yes                   |
| `/cpu/self/avx/blocked`    | Blocked AVX implementation                        | Yes                   |
||
| **CPU Valgrind**           |
| `/cpu/self/memcheck/*`     | Memcheck backends, undefined value checks         | Yes                   |
||
| **CPU LIBXSMM**            |
| `/cpu/self/xsmm/serial`    | Serial LIBXSMM implementation                     | Yes                   |
| `/cpu/self/xsmm/blocked`   | Blocked LIBXSMM implementation                    | Yes                   |
||
| **CUDA Native**            |
| `/gpu/cuda/ref`            | Reference pure CUDA kernels                       | Yes                   |
| `/gpu/cuda/shared`         | Optimized pure CUDA kernels using shared memory   | Yes                   |
| `/gpu/cuda/gen`            | Optimized pure CUDA kernels using code generation | No                    |
||
| **HIP Native**             |
| `/gpu/hip/ref`             | Reference pure HIP kernels                        | Yes                   |
| `/gpu/hip/shared`          | Optimized pure HIP kernels using shared memory    | Yes                   |
| `/gpu/hip/gen`             | Optimized pure HIP kernels using code generation  | No                    |
||
| **MAGMA**                  |
| `/gpu/cuda/magma`          | CUDA MAGMA kernels                                | No                    |
| `/gpu/cuda/magma/det`      | CUDA MAGMA kernels                                | Yes                   |
| `/gpu/hip/magma`           | HIP MAGMA kernels                                 | No                    |
| `/gpu/hip/magma/det`       | HIP MAGMA kernels                                 | Yes                   |
||
| **OCCA**                   |
| `/*/occa`                  | Selects backend based on available OCCA modes     | Yes                   |
| `/cpu/self/occa`           | OCCA backend with serial CPU kernels              | Yes                   |
| `/cpu/openmp/occa`         | OCCA backend with OpenMP kernels                  | Yes                   |
| `/gpu/cuda/occa`           | OCCA backend with CUDA kernels                    | Yes                   |
| `/gpu/hip/occa`            | OCCA backend with HIP kernels                     | Yes                   |

The `/cpu/self/*/serial` backends process one element at a time and are intended for meshes
with a smaller number of high-order elements. The `/cpu/self/*/blocked` backends process
blocked batches of eight interlaced elements and are intended for meshes with higher numbers
of elements.
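
As a rough illustration of what "interlaced" can mean for a batch of eight elements, the sketch below reorders per-element node values so that the same node of all eight elements sits contiguously in memory, which is a vectorization-friendly arrangement. The helper `interlace_block` and the exact ordering are assumptions made for illustration; the actual memory layout used by the libCEED blocked backends is not described in this README.

```python
BLOCK_SIZE = 8  # the blocked backends process batches of eight elements


def interlace_block(element_values):
    """Interlace one block of elements (hypothetical helper).

    element_values[e][n] is the value at node n of element e. The result is a
    flat list in which the element index varies fastest, so the values of one
    node for all elements in the block are adjacent.
    """
    num_elements = len(element_values)
    num_nodes = len(element_values[0])
    return [element_values[e][n]
            for n in range(num_nodes)
            for e in range(num_elements)]


# Eight elements with three nodes each; the value encodes (element, node).
elements = [[10 * e + n for n in range(3)] for e in range(BLOCK_SIZE)]
block = interlace_block(elements)
print(block[:BLOCK_SIZE])                # node 0 of every element in the block
print(block[BLOCK_SIZE:2 * BLOCK_SIZE])  # node 1 of every element in the block
```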

The `/cpu/self/ref/*` backends are written in pure C and provide basic functionality.

The `/cpu/self/opt/*` backends are written in pure C and use partial e-vectors to improve performance.

The `/cpu/self/avx/*` backends rely upon AVX instructions to provide vectorized CPU performance.

The `/cpu/self/memcheck/*` backends rely upon the [Valgrind](http://valgrind.org/) Memcheck tool
to help verify that user QFunctions have no undefined values. To use, run your code with
Valgrind and the Memcheck backends, e.g. `valgrind ./build/ex1 -ceed /cpu/self/memcheck`. A
'development' or 'debugging' version of Valgrind with headers is required to use this backend.
This backend can be run in serial or blocked mode and defaults to running in the serial mode
if `/cpu/self/memcheck` is selected at runtime.

The `/cpu/self/xsmm/*` backends rely upon the [LIBXSMM](http://github.com/hfp/libxsmm) package
to provide vectorized CPU performance. If linking MKL and LIBXSMM is desired but
the Makefile is not detecting `MKLROOT`, linking libCEED against MKL can be
forced by setting the environment variable `MKL=1`.

The `/gpu/cuda/*` backends provide GPU performance strictly using CUDA.

The `/gpu/hip/*` backends provide GPU performance strictly using HIP. They are based on
the `/gpu/cuda/*` backends.  ROCm version 3.6 or newer is required.

The `/gpu/*/magma/*` backends rely upon the [MAGMA](https://bitbucket.org/icl/magma) package.
To enable the MAGMA backends, the environment variable `MAGMA_DIR` must point to the top-level
MAGMA directory, with the MAGMA library located in `$(MAGMA_DIR)/lib/`.
By default, `MAGMA_DIR` is set to `../magma`; to build the MAGMA backends
with a MAGMA installation located elsewhere, create a link to `magma/` in libCEED's parent
directory, or set `MAGMA_DIR` to the proper location.  MAGMA version 2.5.0 or newer is required.
Currently, each MAGMA library installation is only built for either CUDA or HIP.  The corresponding
set of libCEED backends (`/gpu/cuda/magma/*` or `/gpu/hip/magma/*`) will automatically be built
for the version of the MAGMA library found in `MAGMA_DIR`.

Users can specify a device for all CUDA, HIP, and MAGMA backends by adding `:device_id=#`
after the resource name.  For example:

> - `/gpu/cuda/gen:device_id=1`
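
The shape of such a resource string can be sketched with a few lines of Python. The helper `split_resource` below is hypothetical and only illustrates the `backend:key=value` pattern described above; it is not libCEED's own parser, which lives inside the library.

```python
def split_resource(resource):
    """Split 'backend:key=value,...' into (backend, options).

    Hypothetical helper for illustration; option values are kept as strings
    with any surrounding quotes removed.
    """
    backend, _, tail = resource.partition(":")
    options = {}
    if tail:
        for item in tail.split(","):
            key, _, value = item.partition("=")
            options[key.strip()] = value.strip().strip("'\"")
    return backend, options


backend, options = split_resource("/gpu/cuda/gen:device_id=1")
print(backend)   # the backend part of the resource
print(options)   # the options given after the colon
```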

The `/*/occa` backends rely upon the [OCCA](http://github.com/libocca/occa) package to provide
cross platform performance. To enable the OCCA backend, the environment variable `OCCA_DIR` must point
to the top-level OCCA directory, with the OCCA library located in `${OCCA_DIR}/lib`. By default,
`OCCA_DIR` is set to `../occa`.

Additionally, users can pass specific OCCA device properties after setting the CEED resource.
For example:

> - `"/*/occa:mode='CUDA',device_id=0"`

Bit-for-bit reproducibility is important in some applications.
However, some libCEED backends use non-deterministic operations, such as `atomicAdd`, for increased performance.
The backends which are capable of generating reproducible results, with the proper compilation options, are marked in the "Deterministic Capable" column of the table above.
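
The reason accumulation order matters is that floating-point addition is not associative, so summing the same contributions in a different order can change the last bits of the result. The snippet below is a plain Python sketch of that effect only; it does not use or model any libCEED backend.

```python
# The same three contributions, accumulated in two different orders.
values = [0.1, 0.2, 0.3]

left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

# The two sums differ in the last bits, so they do not compare equal.
print(left_to_right == right_to_left)
print(repr(left_to_right), repr(right_to_left))
```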

## Examples

libCEED comes with several examples of its usage, ranging from standalone C
codes in the `/examples/ceed` directory to examples based on external packages,
such as MFEM, PETSc, and Nek5000. Nek5000 v18.0 or greater is required.

To build the examples, set the `MFEM_DIR`, `PETSC_DIR`, and
`NEK5K_DIR` variables and run:

```
cd examples/
```

% running-examples-inclusion-marker

```console
# libCEED examples on CPU and GPU
cd ceed/
make
./ex1-volume -ceed /cpu/self
./ex1-volume -ceed /gpu/cuda
./ex2-surface -ceed /cpu/self
./ex2-surface -ceed /gpu/cuda
cd ..

# MFEM+libCEED examples on CPU and GPU
cd mfem/
make
./bp1 -ceed /cpu/self -no-vis
./bp3 -ceed /gpu/cuda -no-vis
cd ..

# Nek5000+libCEED examples on CPU and GPU
cd nek/
make
./nek-examples.sh -e bp1 -ceed /cpu/self -b 3
./nek-examples.sh -e bp3 -ceed /gpu/cuda -b 3
cd ..

# PETSc+libCEED examples on CPU and GPU
cd petsc/
make
./bps -problem bp1 -ceed /cpu/self
./bps -problem bp2 -ceed /gpu/cuda
./bps -problem bp3 -ceed /cpu/self
./bps -problem bp4 -ceed /gpu/cuda
./bps -problem bp5 -ceed /cpu/self
./bps -problem bp6 -ceed /gpu/cuda
cd ..

cd petsc/
make
./bpsraw -problem bp1 -ceed /cpu/self
./bpsraw -problem bp2 -ceed /gpu/cuda
./bpsraw -problem bp3 -ceed /cpu/self
./bpsraw -problem bp4 -ceed /gpu/cuda
./bpsraw -problem bp5 -ceed /cpu/self
./bpsraw -problem bp6 -ceed /gpu/cuda
cd ..

cd petsc/
make
./bpssphere -problem bp1 -ceed /cpu/self
./bpssphere -problem bp2 -ceed /gpu/cuda
./bpssphere -problem bp3 -ceed /cpu/self
./bpssphere -problem bp4 -ceed /gpu/cuda
./bpssphere -problem bp5 -ceed /cpu/self
./bpssphere -problem bp6 -ceed /gpu/cuda
cd ..

cd petsc/
make
./area -problem cube -ceed /cpu/self -degree 3
./area -problem cube -ceed /gpu/cuda -degree 3
./area -problem sphere -ceed /cpu/self -degree 3 -dm_refine 2
./area -problem sphere -ceed /gpu/cuda -degree 3 -dm_refine 2
cd ..

cd fluids/
make
./navierstokes -ceed /cpu/self -degree 1
./navierstokes -ceed /gpu/cuda -degree 1
cd ..

cd solids/
make
./elasticity -ceed /cpu/self -mesh [.exo file] -degree 2 -E 1 -nu 0.3 -problem Linear -forcing mms
./elasticity -ceed /gpu/cuda -mesh [.exo file] -degree 2 -E 1 -nu 0.3 -problem Linear -forcing mms
cd ..
```

For the last example shown, sample meshes to be used in place of
`[.exo file]` can be found at <https://github.com/jeremylt/ceedSampleMeshes>.

The above code assumes a GPU-capable machine with the CUDA backends
enabled. Depending on the available backends, other CEED resource
specifiers can be provided with the `-ceed` option. Other command line
arguments can be found in [examples/petsc](https://github.com/CEED/libCEED/blob/main/examples/petsc/README.md).

% benchmarks-marker

## Benchmarks

A sequence of benchmarks for all enabled backends can be run using:

```
make benchmarks
```

The results from the benchmarks are stored inside the `benchmarks/` directory
and can be viewed using the following commands (requires Python with matplotlib):

```
cd benchmarks
python postprocess-plot.py petsc-bps-bp1-*-output.txt
python postprocess-plot.py petsc-bps-bp3-*-output.txt
```

Using the `benchmarks` target runs a comprehensive set of benchmarks which may
take some time to run. Subsets of the benchmarks can be run using the scripts in the `benchmarks` folder.

For more details about the benchmarks, see the `benchmarks/README.md` file.

## Install

To install libCEED, run:

```
make install prefix=/usr/local
```

or (e.g., if creating packages):

```
make install prefix=/usr DESTDIR=/packaging/path
```

The usual variables like `CC` and `CFLAGS` are used, and optimization flags
for all languages can be set using the likes of `OPT='-O3 -march=native'`. Use
`STATIC=1` to build static libraries (`libceed.a`).

To install libCEED for Python, run:

```
pip install libceed
```

with the desired setuptools options, such as `--user`.

### pkg-config

In addition to the library and header, libCEED provides a [pkg-config](https://en.wikipedia.org/wiki/Pkg-config)
file that can be used to easily compile and link.
[For example](https://people.freedesktop.org/~dbn/pkg-config-guide.html#faq), if
`$prefix` is a standard location or you set the environment variable
`PKG_CONFIG_PATH`:

```
cc `pkg-config --cflags --libs ceed` -o myapp myapp.c
```

will build `myapp` with libCEED.  This can be used with the source or
installed directories.  Most build systems have support for pkg-config.

## Contact

You can reach the libCEED team by emailing [ceed-users@llnl.gov](mailto:ceed-users@llnl.gov)
or by leaving a comment in the [issue tracker](https://github.com/CEED/libCEED/issues).

## How to Cite

If you utilize libCEED please cite:

```
@article{libceed-joss-paper,
  author       = {Jed Brown and Ahmad Abdelfattah and Valeria Barra and Natalie Beams and Jean Sylvain Camier and Veselin Dobrev and Yohann Dudouit and Leila Ghaffari and Tzanio Kolev and David Medina and Will Pazner and Thilina Ratnayaka and Jeremy Thompson and Stan Tomov},
  title        = {{libCEED}: Fast algebra for high-order element-based discretizations},
  journal      = {Journal of Open Source Software},
  year         = {2021},
  publisher    = {The Open Journal},
  volume       = {6},
  number       = {63},
  pages        = {2945},
  doi          = {10.21105/joss.02945}
}

@misc{libceed-user-manual,
  author       = {Abdelfattah, Ahmad and
                  Barra, Valeria and
                  Beams, Natalie and
                  Brown, Jed and
                  Camier, Jean-Sylvain and
                  Dobrev, Veselin and
                  Dudouit, Yohann and
                  Ghaffari, Leila and
                  Kolev, Tzanio and
                  Medina, David and
                  Pazner, Will and
                  Ratnayaka, Thilina and
                  Thompson, Jeremy L and
                  Tomov, Stanimire},
  title        = {{libCEED} User Manual},
  month        = jul,
  year         = 2021,
  publisher    = {Zenodo},
  version      = {0.9.0},
  doi          = {10.5281/zenodo.5077489}
}
```

For libCEED's Python interface please cite:

```
@InProceedings{libceed-paper-proc-scipy-2020,
  author    = {{V}aleria {B}arra and {J}ed {B}rown and {J}eremy {T}hompson and {Y}ohann {D}udouit},
  title     = {{H}igh-performance operator evaluations with ease of use: lib{C}{E}{E}{D}'s {P}ython interface},
  booktitle = {{P}roceedings of the 19th {P}ython in {S}cience {C}onference},
  pages     = {85 - 90},
  year      = {2020},
  editor    = {{M}eghann {A}garwal and {C}hris {C}alloway and {D}illon {N}iederhut and {D}avid {S}hupe},
  doi       = {10.25080/Majora-342d178e-00c}
}
```

The BibTeX entries for these references can be found in the
`doc/bib/references.bib` file.

## Copyright

The following copyright applies to each file in the CEED software suite, unless
otherwise stated in the file:

> Copyright (c) 2017, Lawrence Livermore National Security, LLC. Produced at the
> Lawrence Livermore National Laboratory. LLNL-CODE-734707. All Rights reserved.

See files LICENSE and NOTICE for details.

[github-badge]: https://github.com/CEED/libCEED/workflows/C/Fortran/badge.svg
[github-link]: https://github.com/CEED/libCEED/actions
[gitlab-badge]: https://gitlab.com/libceed/libCEED/badges/main/pipeline.svg?key_text=GitLab-CI
[gitlab-link]: https://gitlab.com/libceed/libCEED/-/pipelines?page=1&scope=all&ref=main
[azure-badge]: https://dev.azure.com/CEED-ECP/libCEED/_apis/build/status/CEED.libCEED?branchName=main
[azure-link]: https://dev.azure.com/CEED-ECP/libCEED/_build?definitionId=2
[codecov-badge]: https://codecov.io/gh/CEED/libCEED/branch/main/graphs/badge.svg
[codecov-link]: https://codecov.io/gh/CEED/libCEED/
[license-badge]: https://img.shields.io/badge/License-BSD%202--Clause-orange.svg
[license-link]: https://opensource.org/licenses/BSD-2-Clause
[doc-badge]: https://readthedocs.org/projects/libceed/badge/?version=latest
[doc-link]: https://libceed.readthedocs.io/en/latest/?badge=latest
[joss-badge]: https://joss.theoj.org/papers/10.21105/joss.02945/status.svg
[joss-link]: https://doi.org/10.21105/joss.02945
[binder-badge]: http://mybinder.org/badge_logo.svg
[binder-link]: https://mybinder.org/v2/gh/CEED/libCEED/main?urlpath=lab/tree/examples/tutorials/tutorial-0-ceed.ipynb