# libCEED: Efficient Extensible Discretization

[![GitHub Actions][github-badge]][github-link]
[![GitLab-CI][gitlab-badge]][gitlab-link]
[![Code coverage][codecov-badge]][codecov-link]
[![BSD-2-Clause][license-badge]][license-link]
[![Documentation][doc-badge]][doc-link]
[![JOSS paper][joss-badge]][joss-link]
[![Binder][binder-badge]][binder-link]

## Summary and Purpose

libCEED provides fast algebra for element-based discretizations, designed for performance portability, run-time flexibility, and clean embedding in higher-level libraries and applications.
It offers a C99 interface as well as bindings for Fortran, Python, Julia, and Rust.
While our focus is on high-order finite elements, the approach is mostly algebraic and thus applicable to other discretizations in factored form, as explained in the [user manual](https://libceed.org/en/latest/) and the API implementation portion of the [documentation](https://libceed.org/en/latest/api/).

One of the challenges with high-order methods is that a global sparse matrix is no longer a good representation of a high-order linear operator, with respect to both the FLOPs needed for its evaluation and the memory transfer needed for a matvec.
Thus, high-order methods require a new "format" that still represents a linear (or more generally non-linear) operator, but not through a sparse matrix.

The goal of libCEED is to propose such a format, as well as supporting implementations and data structures, that enable efficient operator evaluation on a variety of computational device types (CPUs, GPUs, etc.).
This new operator description is based on algebraically [factored form](https://libceed.org/en/latest/libCEEDapi/#finite-element-operator-decomposition), which is easy to incorporate in a wide variety of applications without significant refactoring of their own discretization infrastructure.
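
To make the idea of a factored operator concrete, the following is a minimal sketch in plain C that does not use the libCEED API. It applies a 1D mass operator on a uniform mesh of linear elements as `A = E^T B^T D B E` (element gather, basis interpolation, pointwise scaling at quadrature points, and the transposes), without ever assembling a sparse matrix. The function name `apply_mass_1d`, the mesh, and the quadrature rule are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative only: apply v = A u for the 1D mass operator in factored
   form A = E^T B^T D B E on num_elem uniform linear elements of width h.
   u and v have num_elem + 1 entries. Not part of the libCEED API. */
void apply_mass_1d(size_t num_elem, double h, const double *u, double *v) {
  /* B: the two linear basis functions at the 2 Gauss points -g, +g */
  const double g = 0.5773502691896257; /* 1/sqrt(3) */
  const double B[2][2] = {{0.5 * (1 + g), 0.5 * (1 - g)},
                          {0.5 * (1 - g), 0.5 * (1 + g)}};
  /* D: quadrature weight (1 per point) times the Jacobian h/2 */
  const double D = 0.5 * h;

  assert(num_elem > 0 && h > 0.0);
  for (size_t i = 0; i <= num_elem; i++) v[i] = 0.0;

  for (size_t e = 0; e < num_elem; e++) {
    /* E: gather the two nodal values of element e */
    const double ue[2] = {u[e], u[e + 1]};
    double ve[2] = {0.0, 0.0};
    for (int q = 0; q < 2; q++) {
      /* B then D: interpolate to quadrature point q and scale pointwise */
      const double uq = D * (B[q][0] * ue[0] + B[q][1] * ue[1]);
      /* B^T: test against each basis function */
      ve[0] += B[q][0] * uq;
      ve[1] += B[q][1] * uq;
    }
    /* E^T: scatter-add into the global vector */
    v[e] += ve[0];
    v[e + 1] += ve[1];
  }
}
```

Only the small dense `B` and the pointwise data `D` are stored; the work and memory traffic per element stay local, which is the property the factored form is meant to exploit.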

The repository is part of the [CEED software suite](http://ceed.exascaleproject.org/software/), a collection of software benchmarks, miniapps, libraries and APIs for efficient exascale discretizations based on high-order finite element and spectral element methods.
See <http://github.com/ceed> for more information and source code availability.

The CEED research is supported by the [Exascale Computing Project](https://exascaleproject.org/exascale-computing-project) (17-SC-20-SC), a collaborative effort of two U.S. Department of Energy organizations (Office of Science and the National Nuclear Security Administration) responsible for the planning and preparation of a [capable exascale ecosystem](https://exascaleproject.org/what-is-exascale), including software, applications, hardware, advanced system engineering and early testbed platforms, in support of the nation’s exascale computing imperative.

For more details on the CEED API see the [user manual](https://libceed.org/en/latest/).

<!-- getting-started-inclusion -->

## Building

The CEED library, `libceed`, is a C99 library with no required dependencies, and with Fortran, Python, Julia, and Rust interfaces.
It can be built using:

```console
$ make
```

or, with optimization flags:

```console
$ make OPT='-O3 -march=skylake-avx512 -ffp-contract=fast'
```

These optimization flags are used by all languages (C, C++, Fortran) and this makefile variable can also be set for testing and examples (below).

The library attempts to automatically detect support for the AVX instruction set using gcc-style compiler options for the host.
Support may need to be manually specified via:

```console
$ make AVX=1
```

or:

```console
$ make AVX=0
```

if your compiler does not support gcc-style options, if you are cross compiling, etc.

To enable CUDA support, add `CUDA_DIR=/opt/cuda` or an appropriate directory to your `make` invocation.
To enable HIP support, add `ROCM_DIR=/opt/rocm` or an appropriate directory.
To enable SYCL support, add `SYCL_DIR=/opt/sycl` or an appropriate directory.
Note that SYCL backends require building with oneAPI compilers as well:

```console
$ . /opt/intel/oneapi/setvars.sh
$ make SYCL_DIR=/opt/intel/oneapi/compiler/latest/linux SYCLCXX=icpx CC=icx CXX=icpx
```

The library can be configured for host applications which use OpenMP parallelism via:

```console
$ make OPENMP=1
```

which allows operators to be created and applied from different threads inside an `omp parallel` region.

To store these or other arguments as defaults for future invocations of `make`, use:

```console
$ make configure CUDA_DIR=/usr/local/cuda ROCM_DIR=/opt/rocm OPT='-O3 -march=znver2'
```

which stores these variables in `config.mk`.

### WebAssembly

libCEED can be built for WASM using [Emscripten](https://emscripten.org).
For example, one can build the library and run a standalone WASM executable using:

```console
$ emmake make build/ex2-surface.wasm
$ wasmer build/ex2-surface.wasm -- -s 200000
```

## Additional Language Interfaces

The Fortran interface is built alongside the library automatically.

Python users can install using:

```console
$ pip install libceed
```

or in a clone of the repository via `pip install .`.

Julia users can install using:

```console
$ julia
julia> ]
pkg> add LibCEED
```

See the [LibCEED.jl documentation](http://ceed.exascaleproject.org/libCEED-julia-docs/dev/) for more information.

Rust users can include libCEED via `Cargo.toml`:

```toml
[dependencies]
libceed = "0.12.0"
```

See the [Cargo documentation](https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#specifying-dependencies-from-git-repositories) for details.

## Testing

The test suite produces [TAP](https://testanything.org) output and is run by:

```console
$ make test
```

or, using the `prove` tool distributed with Perl (recommended):

```console
$ make prove
```

## Backends

There are multiple supported backends, which can be selected at runtime in the examples:

| CEED resource              | Backend                                           | Deterministic Capable |
| :---                       | :---                                              | :---:                 |
||
| **CPU Native**             |
| `/cpu/self/ref/serial`     | Serial reference implementation                   | Yes                   |
| `/cpu/self/ref/blocked`    | Blocked reference implementation                  | Yes                   |
| `/cpu/self/opt/serial`     | Serial optimized C implementation                 | Yes                   |
| `/cpu/self/opt/blocked`    | Blocked optimized C implementation                | Yes                   |
| `/cpu/self/avx/serial`     | Serial AVX implementation                         | Yes                   |
| `/cpu/self/avx/blocked`    | Blocked AVX implementation                        | Yes                   |
||
| **CPU Valgrind**           |
| `/cpu/self/memcheck/*`     | Memcheck backends, undefined value checks         | Yes                   |
||
| **CPU LIBXSMM**            |
| `/cpu/self/xsmm/serial`    | Serial LIBXSMM implementation                     | Yes                   |
| `/cpu/self/xsmm/blocked`   | Blocked LIBXSMM implementation                    | Yes                   |
||
| **CUDA Native**            |
| `/gpu/cuda/ref`            | Reference pure CUDA kernels                       | Yes                   |
| `/gpu/cuda/shared`         | Optimized pure CUDA kernels using shared memory   | Yes                   |
| `/gpu/cuda/gen`            | Optimized pure CUDA kernels using code generation | No                    |
||
| **HIP Native**             |
| `/gpu/hip/ref`             | Reference pure HIP kernels                        | Yes                   |
| `/gpu/hip/shared`          | Optimized pure HIP kernels using shared memory    | Yes                   |
| `/gpu/hip/gen`             | Optimized pure HIP kernels using code generation  | No                    |
||
| **SYCL Native**            |
| `/gpu/sycl/ref`            | Reference pure SYCL kernels                       | Yes                   |
| `/gpu/sycl/shared`         | Optimized pure SYCL kernels using shared memory   | Yes                   |
||
| **MAGMA**                  |
| `/gpu/cuda/magma`          | CUDA MAGMA kernels                                | No                    |
| `/gpu/cuda/magma/det`      | CUDA MAGMA kernels                                | Yes                   |
| `/gpu/hip/magma`           | HIP MAGMA kernels                                 | No                    |
| `/gpu/hip/magma/det`       | HIP MAGMA kernels                                 | Yes                   |
||

The `/cpu/self/*/serial` backends process one element at a time and are intended for meshes with a smaller number of high-order elements.
The `/cpu/self/*/blocked` backends process blocked batches of eight interlaced elements and are intended for meshes with higher numbers of elements.
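
The interlaced layout can be pictured with the sketch below, which is not libCEED code. It copies element-major data into blocks of eight elements so that the same node of all eight elements is contiguous, which lets a simple inner loop over the block vectorize. The function name `interlace_blocks`, the index convention, and the choice to pad the last block by repeating the final element are assumptions made for this illustration; the library's internal layout may differ.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 8 /* elements per block, as described above */

/* Illustrative only: copy element-major data in[e * elem_size + i] into
   the interlaced layout out[(b * elem_size + i) * BLOCK_SIZE + lane],
   where e = b * BLOCK_SIZE + lane. out must hold
   num_blocks * elem_size * BLOCK_SIZE entries. */
void interlace_blocks(size_t num_elem, size_t elem_size, const double *in, double *out) {
  assert(num_elem > 0 && elem_size > 0);
  const size_t num_blocks = (num_elem + BLOCK_SIZE - 1) / BLOCK_SIZE;
  for (size_t b = 0; b < num_blocks; b++) {
    for (size_t i = 0; i < elem_size; i++) {
      for (size_t lane = 0; lane < BLOCK_SIZE; lane++) {
        size_t e = b * BLOCK_SIZE + lane;
        if (e >= num_elem) e = num_elem - 1; /* pad with the final element */
        out[(b * elem_size + i) * BLOCK_SIZE + lane] = in[e * elem_size + i];
      }
    }
  }
}
```

With few elements, most of a block is padding, which is one way to see why the serial backends are suggested for meshes with a smaller number of elements.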

The `/cpu/self/ref/*` backends are written in pure C and provide basic functionality.

The `/cpu/self/opt/*` backends are written in pure C and use partial e-vectors to improve performance.

The `/cpu/self/avx/*` backends rely upon AVX instructions to provide vectorized CPU performance.

The `/cpu/self/memcheck/*` backends rely upon the [Valgrind](https://valgrind.org/) Memcheck tool to help verify that user QFunctions have no undefined values.
To use, run your code with Valgrind and the Memcheck backends, e.g. `valgrind ./build/ex1-volume -ceed /cpu/self/memcheck`.
A 'development' or 'debugging' version of Valgrind with headers is required to use this backend.
This backend can be run in serial or blocked mode and defaults to running in the serial mode if `/cpu/self/memcheck` is selected at runtime.

The `/cpu/self/xsmm/*` backends rely upon the [LIBXSMM](https://github.com/libxsmm/libxsmm) package to provide vectorized CPU performance.
If linking MKL and LIBXSMM is desired but the Makefile is not detecting `MKLROOT`, linking libCEED against MKL can be forced by setting the environment variable `MKL=1`.
The LIBXSMM `main` development branch from 7 April 2024 or newer is required.

The `/gpu/cuda/*` backends provide GPU performance strictly using CUDA.

The `/gpu/hip/*` backends provide GPU performance strictly using HIP.
They are based on the `/gpu/cuda/*` backends.
ROCm version 4.2 or newer is required.

The `/gpu/sycl/*` backends provide GPU performance strictly using SYCL.
They are based on the `/gpu/cuda/*` and `/gpu/hip/*` backends.

The `/gpu/*/magma/*` backends rely upon the [MAGMA](https://bitbucket.org/icl/magma) package.
To enable the MAGMA backends, the environment variable `MAGMA_DIR` must point to the top-level MAGMA directory, with the MAGMA library located in `$(MAGMA_DIR)/lib/`.
By default, `MAGMA_DIR` is set to `../magma`; to build the MAGMA backends with a MAGMA installation located elsewhere, create a link to `magma/` in libCEED's parent directory, or set `MAGMA_DIR` to the proper location.
MAGMA version 2.5.0 or newer is required.
Currently, each MAGMA library installation is only built for either CUDA or HIP.
The corresponding set of libCEED backends (`/gpu/cuda/magma/*` or `/gpu/hip/magma/*`) will automatically be built for the version of the MAGMA library found in `MAGMA_DIR`.

Users can specify a device for all CUDA, HIP, and MAGMA backends by appending `:device_id=#` to the resource name.
For example:

> - `/gpu/cuda/gen:device_id=1`
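
When an application chooses the device at run time (for example, one GPU per process), the resource string can be assembled programmatically. The helper below is a hypothetical sketch and not part of libCEED; only the `:device_id=#` suffix format comes from the text above, and the name `make_resource` is an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write "<backend>:device_id=<device_id>" into buf.
   Returns 0 on success, or -1 if buf (of size len) is too small. */
int make_resource(char *buf, size_t len, const char *backend, int device_id) {
  assert(buf && backend && device_id >= 0);
  int n = snprintf(buf, len, "%s:device_id=%d", backend, device_id);
  return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

The resulting string would be supplied wherever the application passes its CEED resource, such as the `-ceed` option used by the examples.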

Bit-for-bit reproducibility is important in some applications.
However, some libCEED backends use non-deterministic operations, such as `atomicAdd`, for increased performance.
The backends capable of generating reproducible results, given the proper compilation options, are marked "Yes" in the Deterministic Capable column of the table above.
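
The reason unordered accumulation breaks bit-for-bit reproducibility is that floating-point addition is not associative, so the order in which contributions arrive changes the rounded result. The sketch below is plain C and unrelated to any libCEED backend; it sums the same values in two orders, assuming IEEE 754 double precision and no value-changing compiler options such as `-ffast-math`.

```c
#include <assert.h>
#include <stddef.h>

/* Sum x[0..n-1] from first to last. */
double sum_forward(const double *x, size_t n) {
  double s = 0.0;
  assert(x != NULL);
  for (size_t i = 0; i < n; i++) s += x[i];
  return s;
}

/* Sum the same values from last to first. */
double sum_backward(const double *x, size_t n) {
  double s = 0.0;
  assert(x != NULL);
  for (size_t i = n; i-- > 0;) s += x[i];
  return s;
}
```

For the input `{1e17, -1e17, 1.0}`, the forward sum cancels the large terms first and returns `1.0`, while the backward sum absorbs the `1.0` into `-1e17` and returns `0.0`. Atomic updates from many threads arrive in an order that can vary from run to run, which produces the same kind of discrepancy.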

<!-- getting-started-exclusion -->

## Examples

libCEED comes with several examples of its usage, ranging from standalone C codes in the `/examples/ceed` directory to examples based on external packages, such as MFEM, PETSc, and Nek5000.
Nek5000 v18.0 or greater is required.

To build the examples, set the `MFEM_DIR`, `PETSC_DIR` (and optionally `PETSC_ARCH`), and `NEK5K_DIR` variables and run:

```console
$ cd examples/
```

<!-- running-examples-inclusion -->

```console
# libCEED examples on CPU and GPU
$ cd ceed/
$ make
$ ./ex1-volume -ceed /cpu/self
$ ./ex1-volume -ceed /gpu/cuda
$ ./ex2-surface -ceed /cpu/self
$ ./ex2-surface -ceed /gpu/cuda
$ cd ..

# MFEM+libCEED examples on CPU and GPU
$ cd mfem/
$ make
$ ./bp1 -ceed /cpu/self -no-vis
$ ./bp3 -ceed /gpu/cuda -no-vis
$ cd ..

# Nek5000+libCEED examples on CPU and GPU
$ cd nek/
$ make
$ ./nek-examples.sh -e bp1 -ceed /cpu/self -b 3
$ ./nek-examples.sh -e bp3 -ceed /gpu/cuda -b 3
$ cd ..

# PETSc+libCEED examples on CPU and GPU
$ cd petsc/
$ make
$ ./bps -problem bp1 -ceed /cpu/self
$ ./bps -problem bp2 -ceed /gpu/cuda
$ ./bps -problem bp3 -ceed /cpu/self
$ ./bps -problem bp4 -ceed /gpu/cuda
$ ./bps -problem bp5 -ceed /cpu/self
$ ./bps -problem bp6 -ceed /gpu/cuda
$ cd ..

$ cd petsc/
$ make
$ ./bpsraw -problem bp1 -ceed /cpu/self
$ ./bpsraw -problem bp2 -ceed /gpu/cuda
$ ./bpsraw -problem bp3 -ceed /cpu/self
$ ./bpsraw -problem bp4 -ceed /gpu/cuda
$ ./bpsraw -problem bp5 -ceed /cpu/self
$ ./bpsraw -problem bp6 -ceed /gpu/cuda
$ cd ..

$ cd petsc/
$ make
$ ./bpssphere -problem bp1 -ceed /cpu/self
$ ./bpssphere -problem bp2 -ceed /gpu/cuda
$ ./bpssphere -problem bp3 -ceed /cpu/self
$ ./bpssphere -problem bp4 -ceed /gpu/cuda
$ ./bpssphere -problem bp5 -ceed /cpu/self
$ ./bpssphere -problem bp6 -ceed /gpu/cuda
$ cd ..

$ cd petsc/
$ make
$ ./area -problem cube -ceed /cpu/self -degree 3
$ ./area -problem cube -ceed /gpu/cuda -degree 3
$ ./area -problem sphere -ceed /cpu/self -degree 3 -dm_refine 2
$ ./area -problem sphere -ceed /gpu/cuda -degree 3 -dm_refine 2
$ cd ..

$ cd fluids/
$ make
$ ./navierstokes -ceed /cpu/self -degree 1
$ ./navierstokes -ceed /gpu/cuda -degree 1
$ cd ..

$ cd solids/
$ make
$ ./elasticity -ceed /cpu/self -mesh [.exo file] -degree 2 -E 1 -nu 0.3 -problem Linear -forcing mms
$ ./elasticity -ceed /gpu/cuda -mesh [.exo file] -degree 2 -E 1 -nu 0.3 -problem Linear -forcing mms
$ cd ..
```

For the last example shown, sample meshes to be used in place of `[.exo file]` can be found at <https://github.com/jeremylt/ceedSampleMeshes>.

The above code assumes a GPU-capable machine with the CUDA backends enabled.
Depending on the available backends, other CEED resource specifiers can be provided with the `-ceed` option.
Other command line arguments can be found in [examples/petsc](https://github.com/CEED/libCEED/blob/main/examples/petsc/README.md).

<!-- running-examples-exclusion -->

## Benchmarks

A sequence of benchmarks for all enabled backends can be run using:

```console
$ make benchmarks
```

The results from the benchmarks are stored inside the `benchmarks/` directory and can be viewed using the following commands (requires Python with matplotlib):

```console
$ cd benchmarks
$ python postprocess-plot.py petsc-bps-bp1-*-output.txt
$ python postprocess-plot.py petsc-bps-bp3-*-output.txt
```

The `benchmarks` target runs a comprehensive set of benchmarks, which may take some time.
Subsets of the benchmarks can be run using the scripts in the `benchmarks` folder.

For more details about the benchmarks, see the `benchmarks/README.md` file.

## Install

To install libCEED, run:

```console
$ make install prefix=/path/to/install/dir
```

or (e.g., if creating packages):

```console
$ make install prefix=/usr DESTDIR=/packaging/path
```

To build and install in separate steps, run:

```console
$ make for_install=1 prefix=/path/to/install/dir
$ make install prefix=/path/to/install/dir
```

The usual variables like `CC` and `CFLAGS` are used, and optimization flags for all languages can be set with, for example, `OPT='-O3 -march=native'`.
Use `STATIC=1` to build static libraries (`libceed.a`).

To install libCEED for Python, run:

```console
$ pip install libceed
```

with the desired pip options, such as `--user`.

### pkg-config

In addition to the library and headers, libCEED provides a [pkg-config](https://en.wikipedia.org/wiki/Pkg-config) file that can be used to easily compile and link.
[For example](https://people.freedesktop.org/~dbn/pkg-config-guide.html#faq), if `$prefix` is a standard location or you set the environment variable `PKG_CONFIG_PATH`:

```console
$ cc `pkg-config --cflags --libs ceed` -o myapp myapp.c
```

will build `myapp` with libCEED.
This can be used with the source or installed directories.
Most build systems have support for pkg-config.

## Contact

You can reach the libCEED team by emailing [ceed-users@llnl.gov](mailto:ceed-users@llnl.gov) or by leaving a comment in the [issue tracker](https://github.com/CEED/libCEED/issues).

## How to Cite

If you use libCEED, please cite:

```bibtex
@article{libceed-joss-paper,
  author       = {
    Brown, Jed and
    Abdelfattah, Ahmad and
    Barra, Valeria and
    Beams, Natalie and
    Camier, Jean-Sylvain and
    Dobrev, Veselin and
    Dudouit, Yohann and
    Ghaffari, Leila and
    Kolev, Tzanio and
    Medina, David and
    Pazner, Will and
    Ratnayaka, Thilina and
    Thompson, Jeremy L. and
    Tomov, Stan
  },
  title        = {{libCEED}: Fast algebra for high-order element-based discretizations},
  journal      = {Journal of Open Source Software},
  year         = {2021},
  publisher    = {The Open Journal},
  volume       = {6},
  number       = {63},
  pages        = {2945},
  doi          = {10.21105/joss.02945}
}
```

The archival copy of the libCEED user manual is maintained on [Zenodo](https://doi.org/10.5281/zenodo.4302736).
To cite the user manual:

```bibtex
@misc{libceed-user-manual,
  author       = {
    Abdelfattah, Ahmad and
    Barra, Valeria and
    Beams, Natalie and
    Brown, Jed and
    Camier, Jean-Sylvain and
    Dobrev, Veselin and
    Dudouit, Yohann and
    Ghaffari, Leila and
    Grimberg, Sebastian and
    Kolev, Tzanio and
    Medina, David and
    Pazner, Will and
    Ratnayaka, Thilina and
    Shakeri, Rezgar and
    Thompson, Jeremy L. and
    Tomov, Stanimire and
    Wright III, James
  },
  title        = {{libCEED} User Manual},
  month        = nov,
  year         = 2023,
  publisher    = {Zenodo},
  version      = {0.12.0},
  doi          = {10.5281/zenodo.10062388}
}
```

For libCEED's Python interface, please cite:

```bibtex
@InProceedings{libceed-scipy,
  author    = {
    Barra, Valeria and
    Brown, Jed and
    Thompson, Jeremy L. and
    Dudouit, Yohann
  },
  title     = {{H}igh-performance operator evaluations with ease of use: {libCEED}'s {P}ython interface},
  booktitle = {{P}roceedings of the 19th {P}ython in {S}cience {C}onference},
  pages     = {85 - 90},
  year      = {2020},
  editor    = {{M}eghann {A}garwal and {C}hris {C}alloway and {D}illon {N}iederhut and {D}avid {S}hupe},
  doi       = {10.25080/Majora-342d178e-00c}
}
```

The BibTeX entries for these references can be found in the `doc/bib/references.bib` file.

## Copyright

The following copyright applies to each file in the CEED software suite, unless otherwise stated in the file:

> Copyright (c) 2017-2026, Lawrence Livermore National Security, LLC and other CEED contributors.
> All rights reserved.

See files LICENSE and NOTICE for details.

[github-badge]: https://github.com/CEED/libCEED/workflows/C/Fortran/badge.svg
[github-link]: https://github.com/CEED/libCEED/actions
[gitlab-badge]: https://gitlab.com/libceed/libCEED/badges/main/pipeline.svg?key_text=GitLab-CI
[gitlab-link]: https://gitlab.com/libceed/libCEED/-/pipelines?page=1&scope=all&ref=main
[codecov-badge]: https://codecov.io/gh/CEED/libCEED/branch/main/graphs/badge.svg
[codecov-link]: https://codecov.io/gh/CEED/libCEED/
[license-badge]: https://img.shields.io/badge/License-BSD%202--Clause-orange.svg
[license-link]: https://opensource.org/licenses/BSD-2-Clause
[doc-badge]: https://readthedocs.org/projects/libceed/badge/?version=latest
[doc-link]: https://libceed.org/en/latest/?badge=latest
[joss-badge]: https://joss.theoj.org/papers/10.21105/joss.02945/status.svg
[joss-link]: https://doi.org/10.21105/joss.02945
[binder-badge]: http://mybinder.org/badge_logo.svg
[binder-link]: https://mybinder.org/v2/gh/CEED/libCEED/main?urlpath=lab/tree/examples/python/tutorial-0-ceed.ipynb