# libCEED: Efficient Extensible Discretization

[![GitHub Actions][github-badge]][github-link]
[![GitLab-CI][gitlab-badge]][gitlab-link]
[![Code coverage][codecov-badge]][codecov-link]
[![BSD-2-Clause][license-badge]][license-link]
[![Documentation][doc-badge]][doc-link]
[![JOSS paper][joss-badge]][joss-link]
[![Binder][binder-badge]][binder-link]

## Summary and Purpose

libCEED provides fast algebra for element-based discretizations, designed for
performance portability, run-time flexibility, and clean embedding in higher
level libraries and applications. It offers a C99 interface as well as bindings
for Fortran, Python, Julia, and Rust.
While our focus is on high-order finite elements, the approach is mostly
algebraic and thus applicable to other discretizations in factored form, as
explained in the [user manual](https://libceed.org/en/latest/) and
API implementation portion of the
[documentation](https://libceed.org/en/latest/api/).

One of the challenges with high-order methods is that a global sparse matrix is
no longer a good representation of a high-order linear operator, both with
respect to the FLOPs needed for its evaluation and the memory transfer
needed for a matvec.  Thus, high-order methods require a new "format" that still
represents a linear (or more generally non-linear) operator, but not through a
sparse matrix.
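
To make the storage argument concrete, the following Python sketch compares entry counts per element under a rough counting model. The model is an assumption made for illustration and is not taken from libCEED: a tensor-product element of polynomial degree `p` in `dim` dimensions has `(p+1)^dim` nodes, an assembled element matrix is dense with one entry per pair of nodes, and a matrix-free representation stores `n_qdata` values at each of `(p+1)^dim` quadrature points. The function names and the choice `n_qdata = 6` (a symmetric 3x3 tensor) are hypothetical.

```python
def nodes_per_element(p, dim):
    """Number of nodes in a tensor-product element of degree p."""
    return (p + 1) ** dim


def dense_element_matrix_entries(p, dim):
    """Entries in a dense element matrix: one per pair of nodes."""
    n = nodes_per_element(p, dim)
    return n * n


def quadrature_data_entries(p, dim, n_qdata):
    """Stored values for a matrix-free operator.

    Assumption: p + 1 quadrature points per direction and n_qdata
    stored values at each quadrature point.
    """
    return (p + 1) ** dim * n_qdata


if __name__ == "__main__":
    for p in (1, 2, 4, 8):
        dense = dense_element_matrix_entries(p, 3)
        qdata = quadrature_data_entries(p, 3, 6)
        print(f"p={p}: dense={dense}, quadrature data={qdata}")
```

Under this model the dense count grows like `(p+1)^(2*dim)` while the quadrature data grows like `(p+1)^dim`, which is the gap the paragraph above refers to.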

The goal of libCEED is to propose such a format, as well as supporting
implementations and data structures, that enable efficient operator evaluation
on a variety of computational device types (CPUs, GPUs, etc.). This new operator
description is based on algebraically
[factored form](https://libceed.org/en/latest/libCEEDapi/#finite-element-operator-decomposition),
which is easy to incorporate in a wide variety of applications, without significant
refactoring of their own discretization infrastructure.
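
As a minimal illustration of what applying an operator in factored form can look like, the pure-Python sketch below applies a 1D mass operator with linear elements without ever forming a matrix. It does not use the libCEED API, and the linked documentation defines the decomposition libCEED actually uses; the labels `E` (gather element values), `B` (evaluate at quadrature points), and `D` (pointwise scaling) are names chosen for this sketch, as are the function name and the uniform mesh.

```python
import math


def apply_mass_1d(u, num_elem, length=1.0):
    """Apply v = E^T B^T D B E u for a 1D mass operator, linear elements."""
    h = length / num_elem
    # B: the two linear basis functions evaluated at two Gauss points on [-1, 1]
    gauss_points = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))
    basis = [[(1.0 - x) / 2.0, (1.0 + x) / 2.0] for x in gauss_points]
    # D: quadrature weight (1.0 at both points) times the Jacobian h / 2
    d = [1.0 * h / 2.0, 1.0 * h / 2.0]

    v = [0.0] * (num_elem + 1)
    for e in range(num_elem):
        u_e = [u[e], u[e + 1]]  # E: gather the element's nodal values
        u_q = [sum(basis[q][i] * u_e[i] for i in range(2)) for q in range(2)]  # B
        v_q = [d[q] * u_q[q] for q in range(2)]  # D: pointwise scaling
        v_e = [sum(basis[q][i] * v_q[q] for q in range(2)) for i in range(2)]  # B^T
        v[e] += v_e[0]  # E^T: scatter-add back to the global vector
        v[e + 1] += v_e[1]
    return v


if __name__ == "__main__":
    # Applying the mass operator to the constant 1 and summing gives the length.
    print(sum(apply_mass_1d([1.0] * 5, 4)))
```

Only the small `basis` table and the per-point values in `d` are stored, which is the point of the factored form: the operator is a composition of cheap steps rather than a stored sparse matrix.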

The repository is part of the
[CEED software suite](http://ceed.exascaleproject.org/software/), a collection of
software benchmarks, miniapps, libraries and APIs for efficient exascale
discretizations based on high-order finite element and spectral element methods.
See <http://github.com/ceed> for more information and source code availability.

The CEED research is supported by the
[Exascale Computing Project](https://exascaleproject.org/exascale-computing-project)
(17-SC-20-SC), a collaborative effort of two U.S. Department of Energy
organizations (Office of Science and the National Nuclear Security
Administration) responsible for the planning and preparation of a
[capable exascale ecosystem](https://exascaleproject.org/what-is-exascale), including
software, applications, hardware, advanced system engineering and early testbed
platforms, in support of the nation’s exascale computing imperative.

For more details on the CEED API see the [user manual](https://libceed.org/en/latest/).

% gettingstarted-inclusion-marker

## Building

The CEED library, `libceed`, is a C99 library with no required dependencies, and
with Fortran, Python, Julia, and Rust interfaces.  It can be built using:

```
make
```

or, with optimization flags:

```
make OPT='-O3 -march=skylake-avx512 -ffp-contract=fast'
```

These optimization flags are used by all languages (C, C++, Fortran) and this
makefile variable can also be set for testing and examples (below).

The library attempts to automatically detect support for the AVX
instruction set using gcc-style compiler options for the host.
Support may need to be manually specified via:

```
make AVX=1
```

or:

```
make AVX=0
```

if your compiler does not support gcc-style options, if you are cross
compiling, etc.

To enable CUDA support, add `CUDA_DIR=/opt/cuda` or an appropriate directory
to your `make` invocation. To enable HIP support, add `HIP_DIR=/opt/rocm` or
an appropriate directory. To store these or other arguments as defaults for
future invocations of `make`, use:

```
make configure CUDA_DIR=/usr/local/cuda HIP_DIR=/opt/rocm OPT='-O3 -march=znver2'
```

which stores these variables in `config.mk`.

## Additional Language Interfaces

The Fortran interface is built alongside the library automatically.

Python users can install using:

```
pip install libceed
```

or in a clone of the repository via `pip install .`.

Julia users can install using:

```
$ julia
julia> ]
pkg> add LibCEED
```

See the [LibCEED.jl documentation](http://ceed.exascaleproject.org/libCEED-julia-docs/dev/)
for more information.

Rust users can include libCEED via `Cargo.toml`:

```toml
[dependencies]
libceed = { git = "https://github.com/CEED/libCEED", branch = "main" }
```

See the [Cargo documentation](https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#specifying-dependencies-from-git-repositories) for details.

## Testing

The test suite produces [TAP](https://testanything.org) output and is run by:

```
make test
```

or, using the `prove` tool distributed with Perl (recommended):

```
make prove
```
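
Because TAP is a plain-text format, its output is easy to post-process. The sketch below is a simplified, hypothetical summarizer, not part of libCEED or `prove`: it reads the plan line (`1..N`) and the `ok` / `not ok` result lines, ignores everything else, and does not treat `# SKIP` or `# TODO` directives specially. The test descriptions in the sample are made up.

```python
import re

# "ok" or "not ok", an optional test number, then an optional description
_RESULT = re.compile(r"^(not ok|ok)\b(?:\s+(\d+))?(?:\s+-?\s*(.*))?$")
_PLAN = re.compile(r"^1\.\.(\d+)")


def summarize_tap(text):
    """Collect the plan and the passed/failed tests from TAP output."""
    plan = None
    passed, failed = [], []
    for raw in text.splitlines():
        line = raw.strip()
        plan_match = _PLAN.match(line)
        if plan_match:
            plan = int(plan_match.group(1))
            continue
        result = _RESULT.match(line)
        if not result:
            continue  # diagnostics ("# ...") and any other lines
        status, number, description = result.groups()
        entry = (int(number) if number else None, description or "")
        (passed if status == "ok" else failed).append(entry)
    return {"plan": plan, "passed": passed, "failed": failed}


if __name__ == "__main__":
    sample = "1..3\nok 1 - first\nnot ok 2 - second\n# a diagnostic\nok 3 - third\n"
    print(summarize_tap(sample))
```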

## Backends

There are multiple supported backends, which can be selected at runtime in the examples:

| CEED resource              | Backend                                           | Deterministic Capable |
| :---                       | :---                                              | :---:                 |
||
| **CPU Native**             |
| `/cpu/self/ref/serial`     | Serial reference implementation                   | Yes                   |
| `/cpu/self/ref/blocked`    | Blocked reference implementation                  | Yes                   |
| `/cpu/self/opt/serial`     | Serial optimized C implementation                 | Yes                   |
| `/cpu/self/opt/blocked`    | Blocked optimized C implementation                | Yes                   |
| `/cpu/self/avx/serial`     | Serial AVX implementation                         | Yes                   |
| `/cpu/self/avx/blocked`    | Blocked AVX implementation                        | Yes                   |
||
| **CPU Valgrind**           |
| `/cpu/self/memcheck/*`     | Memcheck backends, undefined value checks         | Yes                   |
||
| **CPU LIBXSMM**            |
| `/cpu/self/xsmm/serial`    | Serial LIBXSMM implementation                     | Yes                   |
| `/cpu/self/xsmm/blocked`   | Blocked LIBXSMM implementation                    | Yes                   |
||
| **CUDA Native**            |
| `/gpu/cuda/ref`            | Reference pure CUDA kernels                       | Yes                   |
| `/gpu/cuda/shared`         | Optimized pure CUDA kernels using shared memory   | Yes                   |
| `/gpu/cuda/gen`            | Optimized pure CUDA kernels using code generation | No                    |
||
| **HIP Native**             |
| `/gpu/hip/ref`             | Reference pure HIP kernels                        | Yes                   |
| `/gpu/hip/shared`          | Optimized pure HIP kernels using shared memory    | Yes                   |
| `/gpu/hip/gen`             | Optimized pure HIP kernels using code generation  | No                    |
||
| **MAGMA**                  |
| `/gpu/cuda/magma`          | CUDA MAGMA kernels                                | No                    |
| `/gpu/cuda/magma/det`      | CUDA MAGMA kernels                                | Yes                   |
| `/gpu/hip/magma`           | HIP MAGMA kernels                                 | No                    |
| `/gpu/hip/magma/det`       | HIP MAGMA kernels                                 | Yes                   |
||
| **OCCA**                   |
| `/*/occa`                  | Selects backend based on available OCCA modes     | Yes                   |
| `/cpu/self/occa`           | OCCA backend with serial CPU kernels              | Yes                   |
| `/cpu/openmp/occa`         | OCCA backend with OpenMP kernels                  | Yes                   |
| `/gpu/cuda/occa`           | OCCA backend with CUDA kernels                    | Yes                   |
| `/gpu/hip/occa`            | OCCA backend with HIP kernels                     | Yes                   |

The `/cpu/self/*/serial` backends process one element at a time and are intended for meshes
with a smaller number of high order elements. The `/cpu/self/*/blocked` backends process
blocked batches of eight interlaced elements and are intended for meshes with higher numbers
of elements.
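
The sketch below illustrates one way to picture "blocked batches of eight interlaced elements". The exact memory layout used by the blocked backends is not described here, so this is an assumption: within a block, the values of one node for all eight elements are stored contiguously, which lets a vectorized loop process eight elements in lockstep. Padding a short final block by repeating its last element is also an assumption, and `interlace_blocks` is a hypothetical name.

```python
def interlace_blocks(element_values, block_size=8):
    """Group per-element value lists into interlaced blocks.

    element_values: one list of nodal values per element, all the same length.
    Returns one flat list per block, ordered node by node, with the
    block_size element values for each node stored next to each other.
    """
    num_nodes = len(element_values[0])
    blocks = []
    for start in range(0, len(element_values), block_size):
        block = list(element_values[start:start + block_size])
        # Assumption: pad a short final block by repeating its last element.
        while len(block) < block_size:
            block.append(block[-1])
        # Node-major order: every element's value for node 0, then node 1, ...
        blocks.append([block[e][n] for n in range(num_nodes)
                       for e in range(block_size)])
    return blocks


if __name__ == "__main__":
    # Eight two-node elements; element e holds the values [10*e, 10*e + 1].
    elements = [[10 * e, 10 * e + 1] for e in range(8)]
    print(interlace_blocks(elements)[0])
```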

The `/cpu/self/ref/*` backends are written in pure C and provide basic functionality.

The `/cpu/self/opt/*` backends are written in pure C and use partial e-vectors to improve performance.

The `/cpu/self/avx/*` backends rely upon AVX instructions to provide vectorized CPU performance.

The `/cpu/self/memcheck/*` backends rely upon the [Valgrind](http://valgrind.org/) Memcheck tool
to help verify that user QFunctions have no undefined values. To use, run your code with
Valgrind and the Memcheck backends, e.g. `valgrind ./build/ex1 -ceed /cpu/self/memcheck/serial`. A
'development' or 'debugging' version of Valgrind with headers is required to use this backend.
This backend can be run in serial or blocked mode and defaults to running in the serial mode
if `/cpu/self/memcheck` is selected at runtime.

The `/cpu/self/xsmm/*` backends rely upon the [LIBXSMM](http://github.com/hfp/libxsmm) package
to provide vectorized CPU performance. If linking MKL and LIBXSMM is desired but
the Makefile is not detecting `MKLROOT`, linking libCEED against MKL can be
forced by setting the environment variable `MKL=1`.

The `/gpu/cuda/*` backends provide GPU performance strictly using CUDA.

The `/gpu/hip/*` backends provide GPU performance strictly using HIP. They are based on
the `/gpu/cuda/*` backends.  ROCm version 4.2 or newer is required.

The `/gpu/*/magma/*` backends rely upon the [MAGMA](https://bitbucket.org/icl/magma) package.
To enable the MAGMA backends, the environment variable `MAGMA_DIR` must point to the top-level
MAGMA directory, with the MAGMA library located in `$(MAGMA_DIR)/lib/`.
By default, `MAGMA_DIR` is set to `../magma`; to build the MAGMA backends
with a MAGMA installation located elsewhere, create a link to `magma/` in libCEED's parent
directory, or set `MAGMA_DIR` to the proper location.  MAGMA version 2.5.0 or newer is required.
Currently, each MAGMA library installation is only built for either CUDA or HIP.  The corresponding
set of libCEED backends (`/gpu/cuda/magma/*` or `/gpu/hip/magma/*`) will automatically be built
for the version of the MAGMA library found in `MAGMA_DIR`.

Users can specify a device for all CUDA, HIP, and MAGMA backends by appending `:device_id=#`
to the resource name.  For example:

> - `/gpu/cuda/gen:device_id=1`

The `/*/occa` backends rely upon the [OCCA](http://github.com/libocca/occa) package to provide
cross platform performance. To enable the OCCA backend, the environment variable `OCCA_DIR` must point
to the top-level OCCA directory, with the OCCA library located in `${OCCA_DIR}/lib`. By default,
`OCCA_DIR` is set to `../occa`.

Additionally, users can pass specific OCCA device properties after setting the CEED resource.
For example:

> - `"/*/occa:mode='CUDA',device_id=0"`
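
The two resource strings shown above share the same shape: a resource path, then an optional `:` followed by comma-separated `key=value` pairs. The Python sketch below splits such a string into its parts. It is a hypothetical helper for scripts that assemble `-ceed` arguments and is not how libCEED itself parses resources; values are kept as strings with surrounding quotes removed.

```python
def parse_resource(resource):
    """Split 'path:key=value,key=value' into (path, options dict)."""
    base, separator, option_text = resource.partition(":")
    options = {}
    if separator:
        for item in option_text.split(","):
            if not item.strip():
                continue  # tolerate a trailing ':' or stray comma
            key, _, value = item.partition("=")
            # Drop surrounding single or double quotes, as in mode='CUDA'
            options[key.strip()] = value.strip().strip("'\"")
    return base, options


if __name__ == "__main__":
    print(parse_resource("/gpu/cuda/gen:device_id=1"))
    print(parse_resource("/*/occa:mode='CUDA',device_id=0"))
```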

Bit-for-bit reproducibility is important in some applications.
However, some libCEED backends use non-deterministic operations, such as `atomicAdd`, for increased performance.
The backends that can produce reproducible results, given the proper compilation options, are marked "Yes" in the Deterministic Capable column of the table above.
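
The reason an operation like `atomicAdd` can break bit-for-bit reproducibility is that floating-point addition is not associative, so a sum depends on the order in which contributions arrive. The plain Python sketch below (no GPU involved; the values are chosen only to make the effect obvious) sums the same three double-precision numbers in two orders.

```python
def sum_in_order(values):
    """Accumulate values left to right, as a sequential reduction would."""
    total = 0.0
    for value in values:
        total += value
    return total


if __name__ == "__main__":
    # Same three numbers, two different accumulation orders.
    print(sum_in_order([1e16, 1.0, -1e16]))  # the 1.0 is absorbed by 1e16
    print(sum_in_order([1e16, -1e16, 1.0]))  # the large terms cancel first
```

With atomics, the accumulation order can vary from run to run, so results may differ in the last bits even though every run is mathematically summing the same terms.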

## Examples

libCEED comes with several examples of its usage, ranging from standalone C
codes in the `/examples/ceed` directory to examples based on external packages,
such as MFEM, PETSc, and Nek5000. Nek5000 v18.0 or greater is required.

To build the examples, set the `MFEM_DIR`, `PETSC_DIR`, and
`NEK5K_DIR` variables and run:

```
cd examples/
```

% running-examples-inclusion-marker

```console
# libCEED examples on CPU and GPU
cd ceed/
make
./ex1-volume -ceed /cpu/self
./ex1-volume -ceed /gpu/cuda
./ex2-surface -ceed /cpu/self
./ex2-surface -ceed /gpu/cuda
cd ..

# MFEM+libCEED examples on CPU and GPU
cd mfem/
make
./bp1 -ceed /cpu/self -no-vis
./bp3 -ceed /gpu/cuda -no-vis
cd ..

# Nek5000+libCEED examples on CPU and GPU
cd nek/
make
./nek-examples.sh -e bp1 -ceed /cpu/self -b 3
./nek-examples.sh -e bp3 -ceed /gpu/cuda -b 3
cd ..

# PETSc+libCEED examples on CPU and GPU
cd petsc/
make
./bps -problem bp1 -ceed /cpu/self
./bps -problem bp2 -ceed /gpu/cuda
./bps -problem bp3 -ceed /cpu/self
./bps -problem bp4 -ceed /gpu/cuda
./bps -problem bp5 -ceed /cpu/self
./bps -problem bp6 -ceed /gpu/cuda
cd ..

cd petsc/
make
./bpsraw -problem bp1 -ceed /cpu/self
./bpsraw -problem bp2 -ceed /gpu/cuda
./bpsraw -problem bp3 -ceed /cpu/self
./bpsraw -problem bp4 -ceed /gpu/cuda
./bpsraw -problem bp5 -ceed /cpu/self
./bpsraw -problem bp6 -ceed /gpu/cuda
cd ..

cd petsc/
make
./bpssphere -problem bp1 -ceed /cpu/self
./bpssphere -problem bp2 -ceed /gpu/cuda
./bpssphere -problem bp3 -ceed /cpu/self
./bpssphere -problem bp4 -ceed /gpu/cuda
./bpssphere -problem bp5 -ceed /cpu/self
./bpssphere -problem bp6 -ceed /gpu/cuda
cd ..

cd petsc/
make
./area -problem cube -ceed /cpu/self -degree 3
./area -problem cube -ceed /gpu/cuda -degree 3
./area -problem sphere -ceed /cpu/self -degree 3 -dm_refine 2
./area -problem sphere -ceed /gpu/cuda -degree 3 -dm_refine 2
cd ..

cd fluids/
make
./navierstokes -ceed /cpu/self -degree 1
./navierstokes -ceed /gpu/cuda -degree 1
cd ..

cd solids/
make
./elasticity -ceed /cpu/self -mesh [.exo file] -degree 2 -E 1 -nu 0.3 -problem Linear -forcing mms
./elasticity -ceed /gpu/cuda -mesh [.exo file] -degree 2 -E 1 -nu 0.3 -problem Linear -forcing mms
cd ..
```

For the last example shown, sample meshes to be used in place of
`[.exo file]` can be found at <https://github.com/jeremylt/ceedSampleMeshes>.

The above code assumes a GPU-capable machine with the CUDA backends
enabled. Depending on the available backends, other CEED resource
specifiers can be provided with the `-ceed` option. Other command line
arguments can be found in [examples/petsc](https://github.com/CEED/libCEED/blob/main/examples/petsc/README.md).

% benchmarks-marker

## Benchmarks

A sequence of benchmarks for all enabled backends can be run using:

```
make benchmarks
```

The results from the benchmarks are stored inside the `benchmarks/` directory
and they can be viewed using the commands (requires python with matplotlib):

```
cd benchmarks
python postprocess-plot.py petsc-bps-bp1-*-output.txt
python postprocess-plot.py petsc-bps-bp3-*-output.txt
```

Using the `benchmarks` target runs a comprehensive set of benchmarks which may
take some time to run. Subsets of the benchmarks can be run using the scripts in the `benchmarks` folder.

For more details about the benchmarks, see the `benchmarks/README.md` file.

## Install

To install libCEED, run:

```
make install prefix=/path/to/install/dir
```

or (e.g., if creating packages):

```
make install prefix=/usr DESTDIR=/packaging/path
```
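
When `DESTDIR` is set, the common makefile convention is that files are staged under `DESTDIR` followed by the `prefix` path, while anything recorded inside the installed files still refers to `prefix` alone. The Python sketch below only illustrates that path arithmetic, assuming libCEED's makefile follows the usual convention; the relative path `lib/libceed.a` is a hypothetical example of an installed file, and `staged_path` is not part of any tool.

```python
import posixpath


def staged_path(destdir, prefix, relative_path):
    """Where a file lands on disk for a given DESTDIR and prefix.

    Assumes prefix is an absolute path, as in the commands above.
    """
    installed = posixpath.join(prefix, relative_path)
    if not destdir:
        return installed
    # DESTDIR is prepended to the full installed path.
    return destdir.rstrip("/") + installed


if __name__ == "__main__":
    print(staged_path("/packaging/path", "/usr", "lib/libceed.a"))
    print(staged_path("", "/path/to/install/dir", "lib/libceed.a"))
```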

To build and install in separate steps, run:

```
make for_install=1 prefix=/path/to/install/dir
make install prefix=/path/to/install/dir
```

The usual variables like `CC` and `CFLAGS` are used, and optimization flags
for all languages can be set using the likes of `OPT='-O3 -march=native'`. Use
`STATIC=1` to build static libraries (`libceed.a`).

To install libCEED for Python, run:

```
pip install libceed
```

with the desired pip options, such as `--user`.

### pkg-config

In addition to the library and header, libCEED provides a [pkg-config](https://en.wikipedia.org/wiki/Pkg-config)
file that can be used to easily compile and link.
[For example](https://people.freedesktop.org/~dbn/pkg-config-guide.html#faq), if
`$prefix` is a standard location or you set the environment variable
`PKG_CONFIG_PATH`:

```
cc `pkg-config --cflags --libs ceed` -o myapp myapp.c
```

will build `myapp` with libCEED.  This can be used with the source or
installed directories.  Most build systems have support for pkg-config.
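
For build scripts written in Python, the same command can be assembled programmatically. The sketch below is one hedged way to do it: `pkg_config_flags` runs `pkg-config --cflags --libs` for a package and `compile_command` builds the argument list in the same order as the shell command above. Both function names are hypothetical, and the flags in the example are placeholders rather than what an actual libCEED installation reports.

```python
import shlex
import subprocess


def pkg_config_flags(package="ceed"):
    """Return the flags printed by `pkg-config --cflags --libs <package>`."""
    result = subprocess.run(
        ["pkg-config", "--cflags", "--libs", package],
        check=True, capture_output=True, text=True,
    )
    return shlex.split(result.stdout)


def compile_command(flags, source="myapp.c", output="myapp", cc="cc"):
    """Build the argument list: cc <flags> -o <output> <source>."""
    return [cc, *flags, "-o", output, source]


if __name__ == "__main__":
    # Placeholder flags; on a real system use pkg_config_flags() instead.
    example_flags = ["-I/opt/ceed/include", "-L/opt/ceed/lib", "-lceed"]
    print(" ".join(compile_command(example_flags)))
```

Passing the list to `subprocess.run(..., check=True)` would then perform the build and raise if the compiler fails.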

## Contact

You can reach the libCEED team by emailing [ceed-users@llnl.gov](mailto:ceed-users@llnl.gov)
or by leaving a comment in the [issue tracker](https://github.com/CEED/libCEED/issues).

## How to Cite

If you utilize libCEED, please cite:

```
@article{libceed-joss-paper,
  author       = {Jed Brown and Ahmad Abdelfattah and Valeria Barra and Natalie Beams and Jean Sylvain Camier and Veselin Dobrev and Yohann Dudouit and Leila Ghaffari and Tzanio Kolev and David Medina and Will Pazner and Thilina Ratnayaka and Jeremy Thompson and Stan Tomov},
  title        = {{libCEED}: Fast algebra for high-order element-based discretizations},
  journal      = {Journal of Open Source Software},
  year         = {2021},
  publisher    = {The Open Journal},
  volume       = {6},
  number       = {63},
  pages        = {2945},
  doi          = {10.21105/joss.02945}
}

@misc{libceed-user-manual,
  author       = {Abdelfattah, Ahmad and
                  Barra, Valeria and
                  Beams, Natalie and
                  Brown, Jed and
                  Camier, Jean-Sylvain and
                  Dobrev, Veselin and
                  Dudouit, Yohann and
                  Ghaffari, Leila and
                  Kolev, Tzanio and
                  Medina, David and
                  Pazner, Will and
                  Ratnayaka, Thilina and
                  Thompson, Jeremy L and
                  Tomov, Stanimire},
  title        = {{libCEED} User Manual},
  month        = jul,
  year         = 2021,
  publisher    = {Zenodo},
  version      = {0.9.0},
  doi          = {10.5281/zenodo.5077489}
}
```

For libCEED's Python interface, please cite:

```
@InProceedings{libceed-paper-proc-scipy-2020,
  author    = {{V}aleria {B}arra and {J}ed {B}rown and {J}eremy {T}hompson and {Y}ohann {D}udouit},
  title     = {{H}igh-performance operator evaluations with ease of use: lib{C}{E}{E}{D}'s {P}ython interface},
  booktitle = {{P}roceedings of the 19th {P}ython in {S}cience {C}onference},
  pages     = {85 - 90},
  year      = {2020},
  editor    = {{M}eghann {A}garwal and {C}hris {C}alloway and {D}illon {N}iederhut and {D}avid {S}hupe},
  doi       = {10.25080/Majora-342d178e-00c}
}
```

The BibTeX entries for these references can be found in the
`doc/bib/references.bib` file.

## Copyright

The following copyright applies to each file in the CEED software suite, unless
otherwise stated in the file:

> Copyright (c) 2017, Lawrence Livermore National Security, LLC. Produced at the
> Lawrence Livermore National Laboratory. LLNL-CODE-734707. All Rights reserved.

See files LICENSE and NOTICE for details.

[github-badge]: https://github.com/CEED/libCEED/workflows/C/Fortran/badge.svg
[github-link]: https://github.com/CEED/libCEED/actions
[gitlab-badge]: https://gitlab.com/libceed/libCEED/badges/main/pipeline.svg?key_text=GitLab-CI
[gitlab-link]: https://gitlab.com/libceed/libCEED/-/pipelines?page=1&scope=all&ref=main
[codecov-badge]: https://codecov.io/gh/CEED/libCEED/branch/main/graphs/badge.svg
[codecov-link]: https://codecov.io/gh/CEED/libCEED/
[license-badge]: https://img.shields.io/badge/License-BSD%202--Clause-orange.svg
[license-link]: https://opensource.org/licenses/BSD-2-Clause
[doc-badge]: https://readthedocs.org/projects/libceed/badge/?version=latest
[doc-link]: https://libceed.org/en/latest/?badge=latest
[joss-badge]: https://joss.theoj.org/papers/10.21105/joss.02945/status.svg
[joss-link]: https://doi.org/10.21105/joss.02945
[binder-badge]: http://mybinder.org/badge_logo.svg
[binder-link]: https://mybinder.org/v2/gh/CEED/libCEED/main?urlpath=lab/tree/examples/python/tutorial-0-ceed.ipynb