# libCEED: Efficient Extensible Discretization

[![GitHub Actions][github-badge]][github-link]
[![GitLab-CI][gitlab-badge]][gitlab-link]
[![Code coverage][codecov-badge]][codecov-link]
[![BSD-2-Clause][license-badge]][license-link]
[![Documentation][doc-badge]][doc-link]
[![JOSS paper][joss-badge]][joss-link]
[![Binder][binder-badge]][binder-link]

## Summary and Purpose

libCEED provides fast algebra for element-based discretizations, designed for performance portability, run-time flexibility, and clean embedding in higher level libraries and applications.
It offers a C99 interface as well as bindings for Fortran, Python, Julia, and Rust.
While our focus is on high-order finite elements, the approach is mostly algebraic and thus applicable to other discretizations in factored form, as explained in the [user manual](https://libceed.org/en/latest/) and API implementation portion of the [documentation](https://libceed.org/en/latest/api/).

One of the challenges with high-order methods is that a global sparse matrix is no longer a good representation of a high-order linear operator, with respect to both the FLOPs needed for its evaluation and the memory transfer needed for a matvec.
Thus, high-order methods require a new "format" that still represents a linear (or more generally non-linear) operator, but not through a sparse matrix.
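
To make the cost argument concrete, the following is a rough back-of-the-envelope sketch, not a measurement of libCEED itself. It assumes a 3D hexahedral element of degree `p` with `p + 1` nodes and `p + 1` quadrature points per dimension and a mass-like operator (interpolation only). The function name `element_costs` and the cost model are illustrative assumptions.

```python
def element_costs(p):
    """Rough per-element storage and FLOP counts for a 3D element of degree p.

    Modelling assumptions: p + 1 nodes and p + 1 quadrature points per
    dimension, and a mass-like operator that only needs interpolation.
    """
    n = p + 1
    # Dense element matrix: (n^3) x (n^3) entries, one multiply-add per entry.
    dense_entries = n**6
    dense_flops = 2 * n**6
    # Matrix-free: one stored value per quadrature point.
    matrix_free_entries = n**3
    # Sum factorization: 3 one-dimensional contractions for B and 3 for B^T,
    # each costing 2 * n^4 FLOPs, plus n^3 pointwise multiplications.
    matrix_free_flops = 12 * n**4 + n**3
    return {
        "dense_entries": dense_entries,
        "dense_flops": dense_flops,
        "matrix_free_entries": matrix_free_entries,
        "matrix_free_flops": matrix_free_flops,
    }


for degree in (1, 2, 4, 7):
    print(degree, element_costs(degree))
```

Under this model the dense representation is competitive at the lowest degree, but both its storage and its FLOP count grow much faster than the matrix-free counts as `p` increases.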

The goal of libCEED is to propose such a format, as well as supporting implementations and data structures, that enable efficient operator evaluation on a variety of computational device types (CPUs, GPUs, etc.).
This new operator description is based on an algebraically [factored form](https://libceed.org/en/latest/libCEEDapi/#finite-element-operator-decomposition), which is easy to incorporate in a wide variety of applications, without significant refactoring of their own discretization infrastructure.
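
As an illustration of what a factored form looks like, the sketch below applies a 1D mass operator with linear elements as a chain of small steps (gather, interpolate, scale at quadrature points, transpose-interpolate, scatter-add) without ever assembling a matrix. This is plain Python, not the libCEED API; the function name and the labels `G`, `B`, and `D` in the comments are illustrative, and the parallel communication step is omitted.

```python
import math


def apply_factored_mass(u, num_elem, h):
    """Apply a 1D linear-element mass operator as G^T B^T D B G u."""
    # Two-point Gauss quadrature on the reference element [-1, 1].
    q_pts = [-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0)]
    q_wts = [1.0, 1.0]
    # B: values of the two linear basis functions at each quadrature point.
    B = [[0.5 * (1.0 - x), 0.5 * (1.0 + x)] for x in q_pts]
    v = [0.0] * (num_elem + 1)
    for e in range(num_elem):
        # G: element restriction gathers the two nodal values of element e.
        u_e = [u[e], u[e + 1]]
        # B: interpolate nodal values to the quadrature points.
        u_q = [sum(B[q][i] * u_e[i] for i in range(2)) for q in range(2)]
        # D: pointwise scaling by quadrature weight times Jacobian (h / 2).
        w_q = [q_wts[q] * 0.5 * h * u_q[q] for q in range(2)]
        # B^T: test against the basis functions.
        v_e = [sum(B[q][i] * w_q[q] for q in range(2)) for i in range(2)]
        # G^T: scatter-add the element result into the global vector.
        v[e] += v_e[0]
        v[e + 1] += v_e[1]
    return v


print(apply_factored_mass([1.0, 2.0, 4.0], num_elem=2, h=0.5))
```

The result agrees, up to rounding, with multiplying by the assembled mass matrix built from element matrices `h/6 * [[2, 1], [1, 2]]`, but only the small per-element pieces are ever stored.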

The repository is part of the [CEED software suite](http://ceed.exascaleproject.org/software/), a collection of software benchmarks, miniapps, libraries and APIs for efficient exascale discretizations based on high-order finite element and spectral element methods.
See <http://github.com/ceed> for more information and source code availability.

The CEED research is supported by the [Exascale Computing Project](https://exascaleproject.org/exascale-computing-project) (17-SC-20-SC), a collaborative effort of two U.S. Department of Energy organizations (Office of Science and the National Nuclear Security Administration) responsible for the planning and preparation of a [capable exascale ecosystem](https://exascaleproject.org/what-is-exascale), including software, applications, hardware, advanced system engineering and early testbed platforms, in support of the nation’s exascale computing imperative.

For more details on the CEED API see the [user manual](https://libceed.org/en/latest/).

% gettingstarted-inclusion-marker

## Building

The CEED library, `libceed`, is a C99 library with no required dependencies, and with Fortran, Python, Julia, and Rust interfaces.
It can be built using:

```console
$ make
```

or, with optimization flags:

```console
$ make OPT='-O3 -march=skylake-avx512 -ffp-contract=fast'
```

These optimization flags are used by all languages (C, C++, Fortran) and this makefile variable can also be set for testing and examples (below).

The library attempts to automatically detect support for the AVX instruction set using gcc-style compiler options for the host.
Support may need to be manually specified via:

```console
$ make AVX=1
```

or:

```console
$ make AVX=0
```

if your compiler does not support gcc-style options, if you are cross compiling, etc.

To enable CUDA support, add `CUDA_DIR=/opt/cuda` or an appropriate directory to your `make` invocation.
To enable HIP support, add `HIP_DIR=/opt/rocm` or an appropriate directory.
To store these or other arguments as defaults for future invocations of `make`, use:

```console
$ make configure CUDA_DIR=/usr/local/cuda HIP_DIR=/opt/rocm OPT='-O3 -march=znver2'
```

which stores these variables in `config.mk`.

### WebAssembly

libCEED can be built for WASM using [Emscripten](https://emscripten.org).
For example, one can build the library and run a standalone WASM executable using:

```console
$ emmake make build/ex2-surface.wasm
$ wasmer build/ex2-surface.wasm -- -s 200000
```

## Additional Language Interfaces

The Fortran interface is built alongside the library automatically.

Python users can install using:

```console
$ pip install libceed
```

or in a clone of the repository via `pip install .`.

Julia users can install using:

```console
$ julia
julia> ]
pkg> add LibCEED
```

See the [LibCEED.jl documentation](http://ceed.exascaleproject.org/libCEED-julia-docs/dev/) for more information.

Rust users can include libCEED via `Cargo.toml`:

```toml
[dependencies]
libceed = "0.11.0"
```

See the [Cargo documentation](https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#specifying-dependencies-from-git-repositories) for details.

## Testing

The test suite produces [TAP](https://testanything.org) output and is run by:

```console
$ make test
```

or, using the `prove` tool distributed with Perl (recommended):

```console
$ make prove
```
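
For readers unfamiliar with TAP, the following minimal sketch shows how such output can be summarized. It handles only the plan line (`1..N`) and `ok` / `not ok` test lines; the sample text and its test names are hypothetical and are not actual libCEED test output.

```python
import re

PLAN_LINE = re.compile(r"^1\.\.(\d+)")
TEST_LINE = re.compile(r"^(not ok|ok)\b")


def summarize_tap(text):
    """Count planned, passing, and failing test points in TAP output."""
    planned = None
    passed = 0
    failed = 0
    for raw_line in text.splitlines():
        line = raw_line.strip()
        plan = PLAN_LINE.match(line)
        if plan:
            planned = int(plan.group(1))
            continue
        test = TEST_LINE.match(line)
        if test:
            if test.group(1) == "ok":
                passed += 1  # skipped tests are reported as "ok ... # SKIP"
            else:
                failed += 1
    return {"planned": planned, "passed": passed, "failed": failed}


# Hypothetical sample output, for illustration only.
sample = """1..3
ok 1 - example-a
not ok 2 - example-b
ok 3 - example-c # SKIP backend not available
"""
print(summarize_tap(sample))
```

In practice `prove` performs this kind of aggregation, which is one reason it is the recommended way to run the suite.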

## Backends

There are multiple supported backends, which can be selected at runtime in the examples:

| CEED resource              | Backend                                           | Deterministic Capable |
| :---                       | :---                                              | :---:                 |
||
| **CPU Native**             |
| `/cpu/self/ref/serial`     | Serial reference implementation                   | Yes                   |
| `/cpu/self/ref/blocked`    | Blocked reference implementation                  | Yes                   |
| `/cpu/self/opt/serial`     | Serial optimized C implementation                 | Yes                   |
| `/cpu/self/opt/blocked`    | Blocked optimized C implementation                | Yes                   |
| `/cpu/self/avx/serial`     | Serial AVX implementation                         | Yes                   |
| `/cpu/self/avx/blocked`    | Blocked AVX implementation                        | Yes                   |
||
| **CPU Valgrind**           |
| `/cpu/self/memcheck/*`     | Memcheck backends, undefined value checks         | Yes                   |
||
| **CPU LIBXSMM**            |
| `/cpu/self/xsmm/serial`    | Serial LIBXSMM implementation                     | Yes                   |
| `/cpu/self/xsmm/blocked`   | Blocked LIBXSMM implementation                    | Yes                   |
||
| **CUDA Native**            |
| `/gpu/cuda/ref`            | Reference pure CUDA kernels                       | Yes                   |
| `/gpu/cuda/shared`         | Optimized pure CUDA kernels using shared memory   | Yes                   |
| `/gpu/cuda/gen`            | Optimized pure CUDA kernels using code generation | No                    |
||
| **HIP Native**             |
| `/gpu/hip/ref`             | Reference pure HIP kernels                        | Yes                   |
| `/gpu/hip/shared`          | Optimized pure HIP kernels using shared memory    | Yes                   |
| `/gpu/hip/gen`             | Optimized pure HIP kernels using code generation  | No                    |
||
| **MAGMA**                  |
| `/gpu/cuda/magma`          | CUDA MAGMA kernels                                | No                    |
| `/gpu/cuda/magma/det`      | CUDA MAGMA kernels                                | Yes                   |
| `/gpu/hip/magma`           | HIP MAGMA kernels                                 | No                    |
| `/gpu/hip/magma/det`       | HIP MAGMA kernels                                 | Yes                   |
||
| **OCCA**                   |
| `/*/occa`                  | Selects backend based on available OCCA modes     | Yes                   |
| `/cpu/self/occa`           | OCCA backend with serial CPU kernels              | Yes                   |
| `/cpu/openmp/occa`         | OCCA backend with OpenMP kernels                  | Yes                   |
| `/cpu/dpcpp/occa`          | OCCA backend with DPC++ kernels                   | Yes                   |
| `/gpu/cuda/occa`           | OCCA backend with CUDA kernels                    | Yes                   |
| `/gpu/hip/occa`            | OCCA backend with HIP kernels                     | Yes                   |

The `/cpu/self/*/serial` backends process one element at a time and are intended for meshes with a smaller number of high-order elements.
The `/cpu/self/*/blocked` backends process blocked batches of eight interlaced elements and are intended for meshes with higher numbers of elements.
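
To illustrate what "interlaced" means here, the sketch below packs per-element data into blocks in which the element index varies fastest, so that the same node of all elements in a block sits contiguously (a layout that suits vectorization). This is an illustrative layout only, not libCEED's internal data structure; the function name is hypothetical, and padding the last block by repeating the final element is an assumption of this sketch.

```python
def interlace_blocks(elements, block_size=8):
    """Pack per-element node values into blocks with the element index fastest.

    `elements` is a list of equal-length lists, one list of node values per
    element. The last block is padded by repeating the final element
    (an assumption of this sketch).
    """
    if not elements:
        return []
    num_nodes = len(elements[0])
    blocks = []
    for start in range(0, len(elements), block_size):
        batch = elements[start:start + block_size]
        batch = batch + [batch[-1]] * (block_size - len(batch))
        # For each node, store that node's value from every element in the batch.
        blocks.append(
            [batch[e][n] for n in range(num_nodes) for e in range(block_size)]
        )
    return blocks


# Three elements with two nodes each, packed into one block of four.
elements = [[10 * e, 10 * e + 1] for e in range(3)]
print(interlace_blocks(elements, block_size=4))
```

With many elements the padding overhead is negligible, whereas with only a few high-order elements the serial backends avoid it entirely.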

The `/cpu/self/ref/*` backends are written in pure C and provide basic functionality.

The `/cpu/self/opt/*` backends are written in pure C and use partial e-vectors to improve performance.

The `/cpu/self/avx/*` backends rely upon AVX instructions to provide vectorized CPU performance.

The `/cpu/self/memcheck/*` backends rely upon the [Valgrind](http://valgrind.org/) Memcheck tool to help verify that user QFunctions have no undefined values.
To use, run your code with Valgrind and the Memcheck backends, e.g. `valgrind ./build/ex1 -ceed /cpu/self/memcheck`.
A 'development' or 'debugging' version of Valgrind with headers is required to use this backend.
This backend can be run in serial or blocked mode and defaults to running in the serial mode if `/cpu/self/memcheck` is selected at runtime.

The `/cpu/self/xsmm/*` backends rely upon the [LIBXSMM](http://github.com/hfp/libxsmm) package to provide vectorized CPU performance.
If linking MKL and LIBXSMM is desired but the Makefile is not detecting `MKLROOT`, linking libCEED against MKL can be forced by setting the environment variable `MKL=1`.

The `/gpu/cuda/*` backends provide GPU performance strictly using CUDA.

The `/gpu/hip/*` backends provide GPU performance strictly using HIP.
They are based on the `/gpu/cuda/*` backends.
ROCm version 4.2 or newer is required.

The `/gpu/*/magma/*` backends rely upon the [MAGMA](https://bitbucket.org/icl/magma) package.
To enable the MAGMA backends, the environment variable `MAGMA_DIR` must point to the top-level MAGMA directory, with the MAGMA library located in `$(MAGMA_DIR)/lib/`.
By default, `MAGMA_DIR` is set to `../magma`; to build the MAGMA backends with a MAGMA installation located elsewhere, create a link to `magma/` in libCEED's parent directory, or set `MAGMA_DIR` to the proper location.
MAGMA version 2.5.0 or newer is required.
Currently, each MAGMA library installation is only built for either CUDA or HIP.
The corresponding set of libCEED backends (`/gpu/cuda/magma/*` or `/gpu/hip/magma/*`) will automatically be built for the version of the MAGMA library found in `MAGMA_DIR`.

Users can specify a device for all CUDA, HIP, and MAGMA backends by appending `:device_id=#` to the resource name.
For example:

> - `/gpu/cuda/gen:device_id=1`

The `/*/occa` backends rely upon the [OCCA](http://github.com/libocca/occa) package to provide cross platform performance.
To enable the OCCA backend, the environment variable `OCCA_DIR` must point to the top-level OCCA directory, with the OCCA library located in `${OCCA_DIR}/lib`.
By default, `OCCA_DIR` is set to `../occa`.
OCCA version 1.4.0 or newer is required.

Users can pass specific OCCA device properties after setting the CEED resource.
For example:

> - `"/*/occa:mode='CUDA',device_id=0"`
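
The structure of these resource strings (a backend path, then an optional `:`-separated list of `key=value` properties) can be sketched as follows. This is not libCEED's own parser; `parse_resource` is a hypothetical helper that only illustrates how the two example strings above decompose.

```python
def parse_resource(resource):
    """Split a CEED resource string into a backend path and an options dict."""
    backend, _, option_text = resource.partition(":")
    options = {}
    # Options are comma-separated key=value pairs; values may be quoted.
    for item in filter(None, option_text.split(",")):
        key, _, value = item.partition("=")
        options[key.strip()] = value.strip().strip("'\"")
    return backend, options


print(parse_resource("/gpu/cuda/gen:device_id=1"))
print(parse_resource("/*/occa:mode='CUDA',device_id=0"))
```

Note that the OCCA example is quoted on the command line so that the shell does not expand `*` or strip the inner quotes.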

Bit-for-bit reproducibility is important in some applications.
However, some libCEED backends use non-deterministic operations, such as `atomicAdd`, for increased performance.
The backends that are capable of generating reproducible results, with the proper compilation options, are marked "Yes" in the Deterministic Capable column of the table above.
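
The reason atomic accumulation affects reproducibility is that floating-point addition is not associative, so the order in which contributions arrive changes the rounded result. The sketch below demonstrates this on the CPU with plain Python floats; it is a generic illustration, not libCEED code, and `sum_in_order` is a hypothetical helper.

```python
import math


def sum_in_order(values, order):
    """Accumulate values in the given index order, as a sequence of additions."""
    total = 0.0
    for index in order:
        total += values[index]
    return total


values = [0.1, 0.2, 0.3]
forward = sum_in_order(values, [0, 1, 2])
backward = sum_in_order(values, [2, 1, 0])

# The two results agree to within rounding error but are not bit-for-bit equal.
print(forward == backward, math.isclose(forward, backward))
```

A deterministic backend fixes the accumulation order, which can cost some performance compared with letting hardware atomics complete in whatever order they arrive.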

## Examples

libCEED comes with several examples of its usage, ranging from standalone C codes in the `/examples/ceed` directory to examples based on external packages, such as MFEM, PETSc, and Nek5000.
Nek5000 v18.0 or greater is required.

To build the examples, set the `MFEM_DIR`, `PETSC_DIR`, and `NEK5K_DIR` variables and run:

```console
$ cd examples/
```

% running-examples-inclusion-marker

```console
# libCEED examples on CPU and GPU
$ cd ceed/
$ make
$ ./ex1-volume -ceed /cpu/self
$ ./ex1-volume -ceed /gpu/cuda
$ ./ex2-surface -ceed /cpu/self
$ ./ex2-surface -ceed /gpu/cuda
$ cd ..

# MFEM+libCEED examples on CPU and GPU
$ cd mfem/
$ make
$ ./bp1 -ceed /cpu/self -no-vis
$ ./bp3 -ceed /gpu/cuda -no-vis
$ cd ..

# Nek5000+libCEED examples on CPU and GPU
$ cd nek/
$ make
$ ./nek-examples.sh -e bp1 -ceed /cpu/self -b 3
$ ./nek-examples.sh -e bp3 -ceed /gpu/cuda -b 3
$ cd ..

# PETSc+libCEED examples on CPU and GPU
$ cd petsc/
$ make
$ ./bps -problem bp1 -ceed /cpu/self
$ ./bps -problem bp2 -ceed /gpu/cuda
$ ./bps -problem bp3 -ceed /cpu/self
$ ./bps -problem bp4 -ceed /gpu/cuda
$ ./bps -problem bp5 -ceed /cpu/self
$ ./bps -problem bp6 -ceed /gpu/cuda
$ cd ..

$ cd petsc/
$ make
$ ./bpsraw -problem bp1 -ceed /cpu/self
$ ./bpsraw -problem bp2 -ceed /gpu/cuda
$ ./bpsraw -problem bp3 -ceed /cpu/self
$ ./bpsraw -problem bp4 -ceed /gpu/cuda
$ ./bpsraw -problem bp5 -ceed /cpu/self
$ ./bpsraw -problem bp6 -ceed /gpu/cuda
$ cd ..

$ cd petsc/
$ make
$ ./bpssphere -problem bp1 -ceed /cpu/self
$ ./bpssphere -problem bp2 -ceed /gpu/cuda
$ ./bpssphere -problem bp3 -ceed /cpu/self
$ ./bpssphere -problem bp4 -ceed /gpu/cuda
$ ./bpssphere -problem bp5 -ceed /cpu/self
$ ./bpssphere -problem bp6 -ceed /gpu/cuda
$ cd ..

$ cd petsc/
$ make
$ ./area -problem cube -ceed /cpu/self -degree 3
$ ./area -problem cube -ceed /gpu/cuda -degree 3
$ ./area -problem sphere -ceed /cpu/self -degree 3 -dm_refine 2
$ ./area -problem sphere -ceed /gpu/cuda -degree 3 -dm_refine 2
$ cd ..

# Fluid dynamics (Navier-Stokes) examples on CPU and GPU
$ cd fluids/
$ make
$ ./navierstokes -ceed /cpu/self -degree 1
$ ./navierstokes -ceed /gpu/cuda -degree 1
$ cd ..

# Solid mechanics (elasticity) examples on CPU and GPU
$ cd solids/
$ make
$ ./elasticity -ceed /cpu/self -mesh [.exo file] -degree 2 -E 1 -nu 0.3 -problem Linear -forcing mms
$ ./elasticity -ceed /gpu/cuda -mesh [.exo file] -degree 2 -E 1 -nu 0.3 -problem Linear -forcing mms
$ cd ..
```

For the last example shown, sample meshes to be used in place of `[.exo file]` can be found at <https://github.com/jeremylt/ceedSampleMeshes>.

The above code assumes a GPU-capable machine with the CUDA backends enabled.
Depending on the available backends, other CEED resource specifiers can be provided with the `-ceed` option.
Other command line arguments can be found in [examples/petsc](https://github.com/CEED/libCEED/blob/main/examples/petsc/README.md).

% benchmarks-marker

## Benchmarks

A sequence of benchmarks for all enabled backends can be run using:

```console
$ make benchmarks
```

The results from the benchmarks are stored inside the `benchmarks/` directory and can be viewed using the following commands (requires Python with matplotlib):

```console
$ cd benchmarks
$ python postprocess-plot.py petsc-bps-bp1-*-output.txt
$ python postprocess-plot.py petsc-bps-bp3-*-output.txt
```

Using the `benchmarks` target runs a comprehensive set of benchmarks which may take some time to run.
Subsets of the benchmarks can be run using the scripts in the `benchmarks` folder.

For more details about the benchmarks, see the `benchmarks/README.md` file.

## Install

To install libCEED, run:

```console
$ make install prefix=/path/to/install/dir
```

or (e.g., if creating packages):

```console
$ make install prefix=/usr DESTDIR=/packaging/path
```
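
The interaction of `prefix` and `DESTDIR` follows the usual staged-install convention: files are written under `DESTDIR` but are meant to live under `prefix` on the target system. The sketch below only illustrates that path arithmetic; the helper name and the relative path `lib/libceed.a` are assumptions for illustration, not a listing of what the Makefile installs.

```python
import posixpath


def staged_path(destdir, prefix, relative):
    """Return where a file is written when installing with DESTDIR and prefix."""
    installed = posixpath.join(prefix, relative)
    if not destdir:
        return installed
    # DESTDIR is prepended to the absolute install path, so drop the leading "/".
    return posixpath.join(destdir, installed.lstrip("/"))


# Hypothetical file name, used only to show the resulting location.
print(staged_path("/packaging/path", "/usr", "lib/libceed.a"))
```

A packaging tool then archives the contents of the staging directory so that they unpack to the `prefix` location.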

To build and install in separate steps, run:

```console
$ make for_install=1 prefix=/path/to/install/dir
$ make install prefix=/path/to/install/dir
```

The usual variables like `CC` and `CFLAGS` are used, and optimization flags for all languages can be set using the likes of `OPT='-O3 -march=native'`.
Use `STATIC=1` to build static libraries (`libceed.a`).

To install libCEED for Python, run:

```console
$ pip install libceed
```

with the desired pip options, such as `--user`.

### pkg-config

In addition to the library and header, libCEED provides a [pkg-config](https://en.wikipedia.org/wiki/Pkg-config) file that can be used to easily compile and link.
[For example](https://people.freedesktop.org/~dbn/pkg-config-guide.html#faq), if `$prefix` is a standard location or you set the environment variable `PKG_CONFIG_PATH`:

```console
$ cc `pkg-config --cflags --libs ceed` -o myapp myapp.c
```

will build `myapp` with libCEED.
This can be used with the source or installed directories.
Most build systems have support for pkg-config.
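
For a build script that is not shell-based, the same query can be made programmatically. The following sketch, assuming `pkg-config` is on the `PATH` and can find the `ceed` package, assembles the equivalent compiler command; the helper names are hypothetical, and the script prints nothing if `pkg-config` or the package is unavailable.

```python
import shlex
import shutil
import subprocess


def query_pkg_config(package="ceed"):
    """Return the flags reported by pkg-config, or None if unavailable."""
    if shutil.which("pkg-config") is None:
        return None
    result = subprocess.run(
        ["pkg-config", "--cflags", "--libs", package],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip() if result.returncode == 0 else None


def build_compile_command(flags, source="myapp.c", output="myapp", compiler="cc"):
    """Assemble the compiler invocation from a pkg-config flags string."""
    return [compiler, *shlex.split(flags), "-o", output, source]


flags = query_pkg_config()
if flags is not None:
    print(" ".join(build_compile_command(flags)))
```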

## Contact

You can reach the libCEED team by emailing [ceed-users@llnl.gov](mailto:ceed-users@llnl.gov) or by leaving a comment in the [issue tracker](https://github.com/CEED/libCEED/issues).

## How to Cite

If you utilize libCEED, please cite:

```bibtex
@article{libceed-joss-paper,
  author       = {Jed Brown and Ahmad Abdelfattah and Valeria Barra and Natalie Beams and Jean Sylvain Camier and Veselin Dobrev and Yohann Dudouit and Leila Ghaffari and Tzanio Kolev and David Medina and Will Pazner and Thilina Ratnayaka and Jeremy Thompson and Stan Tomov},
  title        = {{libCEED}: Fast algebra for high-order element-based discretizations},
  journal      = {Journal of Open Source Software},
  year         = {2021},
  publisher    = {The Open Journal},
  volume       = {6},
  number       = {63},
  pages        = {2945},
  doi          = {10.21105/joss.02945}
}
```

The archival copy of the libCEED user manual is maintained on [Zenodo](https://doi.org/10.5281/zenodo.4302736).
To cite the user manual:

```bibtex
@misc{libceed-user-manual,
  author       = {Abdelfattah, Ahmad and
                  Barra, Valeria and
                  Beams, Natalie and
                  Brown, Jed and
                  Camier, Jean-Sylvain and
                  Dobrev, Veselin and
                  Dudouit, Yohann and
                  Ghaffari, Leila and
                  Kolev, Tzanio and
                  Medina, David and
                  Pazner, Will and
                  Ratnayaka, Thilina and
                  Shakeri, Rezgar and
                  Thompson, Jeremy L and
                  Tomov, Stanimire and
                  Wright III, James},
  title        = {{libCEED} User Manual},
  month        = dec,
  year         = 2022,
  publisher    = {Zenodo},
  version      = {0.11.0},
  doi          = {10.5281/zenodo.7480454}
}
```

For libCEED's Python interface, please cite:

```bibtex
@InProceedings{libceed-paper-proc-scipy-2020,
  author    = {{V}aleria {B}arra and {J}ed {B}rown and {J}eremy {T}hompson and {Y}ohann {D}udouit},
  title     = {{H}igh-performance operator evaluations with ease of use: lib{C}{E}{E}{D}'s {P}ython interface},
  booktitle = {{P}roceedings of the 19th {P}ython in {S}cience {C}onference},
  pages     = {85 - 90},
  year      = {2020},
  editor    = {{M}eghann {A}garwal and {C}hris {C}alloway and {D}illon {N}iederhut and {D}avid {S}hupe},
  doi       = {10.25080/Majora-342d178e-00c}
}
```

The BibTeX entries for these references can be found in the `doc/bib/references.bib` file.

## Copyright

The following copyright applies to each file in the CEED software suite, unless otherwise stated in the file:

> Copyright (c) 2017, Lawrence Livermore National Security, LLC. Produced at the
> Lawrence Livermore National Laboratory. LLNL-CODE-734707. All Rights reserved.

See files LICENSE and NOTICE for details.

[github-badge]: https://github.com/CEED/libCEED/workflows/C/Fortran/badge.svg
[github-link]: https://github.com/CEED/libCEED/actions
[gitlab-badge]: https://gitlab.com/libceed/libCEED/badges/main/pipeline.svg?key_text=GitLab-CI
[gitlab-link]: https://gitlab.com/libceed/libCEED/-/pipelines?page=1&scope=all&ref=main
[codecov-badge]: https://codecov.io/gh/CEED/libCEED/branch/main/graphs/badge.svg
[codecov-link]: https://codecov.io/gh/CEED/libCEED/
[license-badge]: https://img.shields.io/badge/License-BSD%202--Clause-orange.svg
[license-link]: https://opensource.org/licenses/BSD-2-Clause
[doc-badge]: https://readthedocs.org/projects/libceed/badge/?version=latest
[doc-link]: https://libceed.org/en/latest/?badge=latest
[joss-badge]: https://joss.theoj.org/papers/10.21105/joss.02945/status.svg
[joss-link]: https://doi.org/10.21105/joss.02945
[binder-badge]: http://mybinder.org/badge_logo.svg
[binder-link]: https://mybinder.org/v2/gh/CEED/libCEED/main?urlpath=lab/tree/examples/python/tutorial-0-ceed.ipynb