# libCEED: Efficient Extensible Discretization

[![GitHub Actions][github-badge]][github-link]
[![GitLab-CI][gitlab-badge]][gitlab-link]
[![Code coverage][codecov-badge]][codecov-link]
[![BSD-2-Clause][license-badge]][license-link]
[![Documentation][doc-badge]][doc-link]
[![JOSS paper][joss-badge]][joss-link]
[![Binder][binder-badge]][binder-link]

## Summary and Purpose

libCEED provides fast algebra for element-based discretizations, designed for performance portability, run-time flexibility, and clean embedding in higher level libraries and applications.
It offers a C99 interface as well as bindings for Fortran, Python, Julia, and Rust.
While our focus is on high-order finite elements, the approach is mostly algebraic and thus applicable to other discretizations in factored form, as explained in the [user manual](https://libceed.org/en/latest/) and API implementation portion of the [documentation](https://libceed.org/en/latest/api/).

One of the challenges with high-order methods is that a global sparse matrix is no longer a good representation of a high-order linear operator, both with respect to the FLOPs needed for its evaluation and the memory transfer needed for a matvec.
Thus, high-order methods require a new "format" that still represents a linear (or more generally non-linear) operator, but not through a sparse matrix.

The goal of libCEED is to propose such a format, as well as supporting implementations and data structures, that enable efficient operator evaluation on a variety of computational device types (CPUs, GPUs, etc.).
This new operator description is based on algebraically [factored form](https://libceed.org/en/latest/libCEEDapi/#finite-element-operator-decomposition), which is easy to incorporate in a wide variety of applications, without significant refactoring of their own discretization infrastructure.

The repository is part of the [CEED software suite](http://ceed.exascaleproject.org/software/), a collection of software benchmarks, miniapps, libraries and APIs for efficient exascale discretizations based on high-order finite element and spectral element methods.
See <http://github.com/ceed> for more information and source code availability.

The CEED research is supported by the [Exascale Computing Project](https://exascaleproject.org/exascale-computing-project) (17-SC-20-SC), a collaborative effort of two U.S. Department of Energy organizations (Office of Science and the National Nuclear Security Administration) responsible for the planning and preparation of a [capable exascale ecosystem](https://exascaleproject.org/what-is-exascale), including software, applications, hardware, advanced system engineering and early testbed platforms, in support of the nation’s exascale computing imperative.

For more details on the CEED API see the [user manual](https://libceed.org/en/latest/).
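
To make the idea of a factored operator concrete, here is a minimal sketch in plain Python, independent of libCEED and not using its API. It assumes a 1D mass operator on a uniform mesh of linear elements and applies it as `E^T B^T D B E`: an element restriction `E`, a basis evaluation `B` at two Gauss points, and a pointwise quadrature scaling `D`. The function names `apply_mass_matrix_free` and `assemble_mass` are hypothetical and exist only for this illustration.

```python
import math


def apply_mass_matrix_free(u, num_elem, h):
    """Apply the 1D mass operator as E^T B^T D B E without forming a matrix.

    Illustrative only: linear elements of width h, 2-point Gauss quadrature.
    """
    gauss = 1.0 / math.sqrt(3.0)
    # B[i][j]: value of shape function j at quadrature point i on [-1, 1]
    basis = [[(1.0 - x) / 2.0, (1.0 + x) / 2.0] for x in (-gauss, gauss)]
    weights = [1.0, 1.0]
    v = [0.0] * (num_elem + 1)
    for e in range(num_elem):
        nodes = (e, e + 1)
        # E: gather the element's nodal values from the global vector
        u_e = [u[n] for n in nodes]
        # B: interpolate nodal values to quadrature points
        u_q = [sum(basis[i][j] * u_e[j] for j in range(2)) for i in range(2)]
        # D: pointwise scaling by quadrature weight and element Jacobian h/2
        v_q = [weights[i] * (h / 2.0) * u_q[i] for i in range(2)]
        # B^T then E^T: project back to nodes and scatter-add into the result
        for j, n in enumerate(nodes):
            v[n] += sum(basis[i][j] * v_q[i] for i in range(2))
    return v


def assemble_mass(num_elem, h):
    """Assemble the same operator as a dense matrix, for comparison only."""
    n = num_elem + 1
    matrix = [[0.0] * n for _ in range(n)]
    local = [[h / 3.0, h / 6.0], [h / 6.0, h / 3.0]]
    for e in range(num_elem):
        for a in range(2):
            for b in range(2):
                matrix[e + a][e + b] += local[a][b]
    return matrix


if __name__ == "__main__":
    u = [0.0, 1.0, 4.0, 9.0, 16.0]
    v_free = apply_mass_matrix_free(u, num_elem=4, h=0.25)
    A = assemble_mass(num_elem=4, h=0.25)
    v_assembled = [sum(A[i][j] * u[j] for j in range(5)) for i in range(5)]
    print(max(abs(a - b) for a, b in zip(v_free, v_assembled)))
```

For linear elements the two approaches cost about the same; the motivation given above applies as the polynomial degree grows, where the assembled matrix becomes dense within each element while the factored pieces stay small.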

% gettingstarted-inclusion-marker

## Building

The CEED library, `libceed`, is a C99 library with no required dependencies, and with Fortran, Python, Julia, and Rust interfaces.
It can be built using:

```console
$ make
```

or, with optimization flags:

```console
$ make OPT='-O3 -march=skylake-avx512 -ffp-contract=fast'
```

These optimization flags are used by all languages (C, C++, Fortran) and this makefile variable can also be set for testing and examples (below).

The library attempts to automatically detect support for the AVX instruction set using gcc-style compiler options for the host.
Support may need to be manually specified via:

```console
$ make AVX=1
```

or:

```console
$ make AVX=0
```

if your compiler does not support gcc-style options, if you are cross compiling, etc.

To enable CUDA support, add `CUDA_DIR=/opt/cuda` or an appropriate directory to your `make` invocation.
To enable HIP support, add `ROCM_DIR=/opt/rocm` or an appropriate directory.
To enable SYCL support, add `SYCL_DIR=/opt/sycl` or an appropriate directory.
Note that SYCL backends require building with oneAPI compilers as well:

```console
$ . /opt/intel/oneapi/setvars.sh
$ make SYCL_DIR=/opt/intel/oneapi/compiler/latest/linux SYCLCXX=icpx CC=icx CXX=icpx
```

To store these or other arguments as defaults for future invocations of `make`, use:

```console
$ make configure CUDA_DIR=/usr/local/cuda ROCM_DIR=/opt/rocm OPT='-O3 -march=znver2'
```

which stores these variables in `config.mk`.

### WebAssembly

libCEED can be built for WASM using [Emscripten](https://emscripten.org). For example, one can build the library and run a standalone WASM executable using

```console
$ emmake make build/ex2-surface.wasm
$ wasmer build/ex2-surface.wasm -- -s 200000
```

## Additional Language Interfaces

The Fortran interface is built alongside the library automatically.

Python users can install using:

```console
$ pip install libceed
```

or in a clone of the repository via `pip install .`.

Julia users can install using:

```console
$ julia
julia> ]
pkg> add LibCEED
```

See the [LibCEED.jl documentation](http://ceed.exascaleproject.org/libCEED-julia-docs/dev/) for more information.

Rust users can include libCEED via `Cargo.toml`:

```toml
[dependencies]
libceed = "0.11.0"
```

See the [Cargo documentation](https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#specifying-dependencies-from-git-repositories) for details.

## Testing

The test suite produces [TAP](https://testanything.org) output and is run by:

```console
$ make test
```

or, using the `prove` tool distributed with Perl (recommended):

```console
$ make prove
```

## Backends

There are multiple supported backends, which can be selected at runtime in the examples:

| CEED resource | Backend | Deterministic Capable |
| :--- | :--- | :---: |
||
| **CPU Native** |
| `/cpu/self/ref/serial` | Serial reference implementation | Yes |
| `/cpu/self/ref/blocked` | Blocked reference implementation | Yes |
| `/cpu/self/opt/serial` | Serial optimized C implementation | Yes |
| `/cpu/self/opt/blocked` | Blocked optimized C implementation | Yes |
| `/cpu/self/avx/serial` | Serial AVX implementation | Yes |
| `/cpu/self/avx/blocked` | Blocked AVX implementation | Yes |
||
| **CPU Valgrind** |
| `/cpu/self/memcheck/*` | Memcheck backends, undefined value checks | Yes |
||
| **CPU LIBXSMM** |
| `/cpu/self/xsmm/serial` | Serial LIBXSMM implementation | Yes |
| `/cpu/self/xsmm/blocked` | Blocked LIBXSMM implementation | Yes |
||
| **CUDA Native** |
| `/gpu/cuda/ref` | Reference pure CUDA kernels | Yes |
| `/gpu/cuda/shared` | Optimized pure CUDA kernels using shared memory | Yes |
| `/gpu/cuda/gen` | Optimized pure CUDA kernels using code generation | No |
||
| **HIP Native** |
| `/gpu/hip/ref` | Reference pure HIP kernels | Yes |
| `/gpu/hip/shared` | Optimized pure HIP kernels using shared memory | Yes |
| `/gpu/hip/gen` | Optimized pure HIP kernels using code generation | No |
||
| **SYCL Native** |
| `/gpu/sycl/ref` | Reference pure SYCL kernels | Yes |
| `/gpu/sycl/shared` | Optimized pure SYCL kernels using shared memory | Yes |
||
| **MAGMA** |
| `/gpu/cuda/magma` | CUDA MAGMA kernels | No |
| `/gpu/cuda/magma/det` | CUDA MAGMA kernels | Yes |
| `/gpu/hip/magma` | HIP MAGMA kernels | No |
| `/gpu/hip/magma/det` | HIP MAGMA kernels | Yes |
||
| **OCCA** |
| `/*/occa` | Selects backend based on available OCCA modes | Yes |
| `/cpu/self/occa` | OCCA backend with serial CPU kernels | Yes |
| `/cpu/openmp/occa` | OCCA backend with OpenMP kernels | Yes |
| `/cpu/dpcpp/occa` | OCCA backend with DPC++ kernels | Yes |
| `/gpu/cuda/occa` | OCCA backend with CUDA kernels | Yes |
| `/gpu/hip/occa` | OCCA backend with HIP kernels | Yes |

The `/cpu/self/*/serial` backends process one element at a time and are intended for meshes with a smaller number of high-order elements.
The `/cpu/self/*/blocked` backends process blocked batches of eight interlaced elements and are intended for meshes with higher numbers of elements.

The `/cpu/self/ref/*` backends are written in pure C and provide basic functionality.

The `/cpu/self/opt/*` backends are written in pure C and use partial e-vectors to improve performance.

The `/cpu/self/avx/*` backends rely upon AVX instructions to provide vectorized CPU performance.

The `/cpu/self/memcheck/*` backends rely upon the [Valgrind](https://valgrind.org/) Memcheck tool to help verify that user QFunctions have no undefined values.
To use, run your code with Valgrind and the Memcheck backends, e.g. `valgrind ./build/ex1 -ceed /cpu/self/memcheck`.
A 'development' or 'debugging' version of Valgrind with headers is required to use this backend.
This backend can be run in serial or blocked mode and defaults to running in the serial mode if `/cpu/self/memcheck` is selected at runtime.

The `/cpu/self/xsmm/*` backends rely upon the [LIBXSMM](https://github.com/libxsmm/libxsmm) package to provide vectorized CPU performance.
If linking MKL and LIBXSMM is desired but the Makefile is not detecting `MKLROOT`, linking libCEED against MKL can be forced by setting the environment variable `MKL=1`.

The `/gpu/cuda/*` backends provide GPU performance strictly using CUDA.

The `/gpu/hip/*` backends provide GPU performance strictly using HIP.
They are based on the `/gpu/cuda/*` backends.
ROCm version 4.2 or newer is required.

The `/gpu/sycl/*` backends provide GPU performance strictly using SYCL.
They are based on the `/gpu/cuda/*` and `/gpu/hip/*` backends.

The `/gpu/*/magma/*` backends rely upon the [MAGMA](https://bitbucket.org/icl/magma) package.
To enable the MAGMA backends, the environment variable `MAGMA_DIR` must point to the top-level MAGMA directory, with the MAGMA library located in `$(MAGMA_DIR)/lib/`.
By default, `MAGMA_DIR` is set to `../magma`; to build the MAGMA backends with a MAGMA installation located elsewhere, create a link to `magma/` in libCEED's parent directory, or set `MAGMA_DIR` to the proper location.
MAGMA version 2.5.0 or newer is required.
Currently, each MAGMA library installation is only built for either CUDA or HIP.
The corresponding set of libCEED backends (`/gpu/cuda/magma/*` or `/gpu/hip/magma/*`) will automatically be built for the version of the MAGMA library found in `MAGMA_DIR`.

Users can specify a device for all CUDA, HIP, and MAGMA backends by adding `:device_id=#` after the resource name.
For example:

> - `/gpu/cuda/gen:device_id=1`

The `/*/occa` backends rely upon the [OCCA](http://github.com/libocca/occa) package to provide cross platform performance.
To enable the OCCA backend, the environment variable `OCCA_DIR` must point to the top-level OCCA directory, with the OCCA library located in `${OCCA_DIR}/lib` (by default, `OCCA_DIR` is set to `../occa`).
OCCA version 1.4.0 or newer is required.

Users can pass specific OCCA device properties after setting the CEED resource.
For example:

> - `"/*/occa:mode='CUDA',device_id=0"`

Bit-for-bit reproducibility is important in some applications.
However, some libCEED backends use non-deterministic operations, such as `atomicAdd`, for increased performance.
The backends which are capable of generating reproducible results, with the proper compilation options, are marked in the "Deterministic Capable" column of the table above.

## Examples

libCEED comes with several examples of its usage, ranging from standalone C codes in the `/examples/ceed` directory to examples based on external packages, such as MFEM, PETSc, and Nek5000.
Nek5000 v18.0 or greater is required.
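
The example commands in this section select a backend by passing a resource string to `-ceed`, optionally followed by `:` and comma-separated options such as `device_id`. As a rough illustration of how such a string breaks down, the sketch below splits one into its backend path and its options. The helper `split_resource` is hypothetical and only mirrors the string format shown under Backends; it is not libCEED's own parser, and libCEED interprets the resource string itself.

```python
def split_resource(resource):
    """Split a CEED resource string into (backend_path, options).

    Hypothetical helper for illustration; values are returned as strings.
    """
    # Everything before the first ':' names the backend
    path, _, option_text = resource.partition(":")
    options = {}
    for item in option_text.split(","):
        if not item:
            continue  # no options were given
        key, _, value = item.partition("=")
        # Drop optional quoting, as in mode='CUDA'
        options[key.strip()] = value.strip().strip("'\"")
    return path, options


if __name__ == "__main__":
    for text in ("/cpu/self",
                 "/gpu/cuda/gen:device_id=1",
                 "/*/occa:mode='CUDA',device_id=0"):
        print(split_resource(text))
```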

To build the examples, set the `MFEM_DIR`, `PETSC_DIR`, and `NEK5K_DIR` variables and run:

```console
$ cd examples/
```

% running-examples-inclusion-marker

```console
# libCEED examples on CPU and GPU
$ cd ceed/
$ make
$ ./ex1-volume -ceed /cpu/self
$ ./ex1-volume -ceed /gpu/cuda
$ ./ex2-surface -ceed /cpu/self
$ ./ex2-surface -ceed /gpu/cuda
$ cd ..

# MFEM+libCEED examples on CPU and GPU
$ cd mfem/
$ make
$ ./bp1 -ceed /cpu/self -no-vis
$ ./bp3 -ceed /gpu/cuda -no-vis
$ cd ..

# Nek5000+libCEED examples on CPU and GPU
$ cd nek/
$ make
$ ./nek-examples.sh -e bp1 -ceed /cpu/self -b 3
$ ./nek-examples.sh -e bp3 -ceed /gpu/cuda -b 3
$ cd ..

# PETSc+libCEED examples on CPU and GPU
$ cd petsc/
$ make
$ ./bps -problem bp1 -ceed /cpu/self
$ ./bps -problem bp2 -ceed /gpu/cuda
$ ./bps -problem bp3 -ceed /cpu/self
$ ./bps -problem bp4 -ceed /gpu/cuda
$ ./bps -problem bp5 -ceed /cpu/self
$ ./bps -problem bp6 -ceed /gpu/cuda
$ cd ..

$ cd petsc/
$ make
$ ./bpsraw -problem bp1 -ceed /cpu/self
$ ./bpsraw -problem bp2 -ceed /gpu/cuda
$ ./bpsraw -problem bp3 -ceed /cpu/self
$ ./bpsraw -problem bp4 -ceed /gpu/cuda
$ ./bpsraw -problem bp5 -ceed /cpu/self
$ ./bpsraw -problem bp6 -ceed /gpu/cuda
$ cd ..

$ cd petsc/
$ make
$ ./bpssphere -problem bp1 -ceed /cpu/self
$ ./bpssphere -problem bp2 -ceed /gpu/cuda
$ ./bpssphere -problem bp3 -ceed /cpu/self
$ ./bpssphere -problem bp4 -ceed /gpu/cuda
$ ./bpssphere -problem bp5 -ceed /cpu/self
$ ./bpssphere -problem bp6 -ceed /gpu/cuda
$ cd ..

$ cd petsc/
$ make
$ ./area -problem cube -ceed /cpu/self -degree 3
$ ./area -problem cube -ceed /gpu/cuda -degree 3
$ ./area -problem sphere -ceed /cpu/self -degree 3 -dm_refine 2
$ ./area -problem sphere -ceed /gpu/cuda -degree 3 -dm_refine 2
$ cd ..

$ cd fluids/
$ make
$ ./navierstokes -ceed /cpu/self -degree 1
$ ./navierstokes -ceed /gpu/cuda -degree 1
$ cd ..

$ cd solids/
$ make
$ ./elasticity -ceed /cpu/self -mesh [.exo file] -degree 2 -E 1 -nu 0.3 -problem Linear -forcing mms
$ ./elasticity -ceed /gpu/cuda -mesh [.exo file] -degree 2 -E 1 -nu 0.3 -problem Linear -forcing mms
$ cd ..
```

For the last example shown, sample meshes to be used in place of `[.exo file]` can be found at <https://github.com/jeremylt/ceedSampleMeshes>.

The above code assumes a GPU-capable machine with the CUDA backends enabled.
Depending on the available backends, other CEED resource specifiers can be provided with the `-ceed` option.
Other command line arguments can be found in [examples/petsc](https://github.com/CEED/libCEED/blob/main/examples/petsc/README.md).

% benchmarks-marker

## Benchmarks

A sequence of benchmarks for all enabled backends can be run using:

```console
$ make benchmarks
```

The results from the benchmarks are stored inside the `benchmarks/` directory and they can be viewed using the commands (requires Python with matplotlib):

```console
$ cd benchmarks
$ python postprocess-plot.py petsc-bps-bp1-*-output.txt
$ python postprocess-plot.py petsc-bps-bp3-*-output.txt
```

Using the `benchmarks` target runs a comprehensive set of benchmarks which may take some time to run.
Subsets of the benchmarks can be run using the scripts in the `benchmarks` folder.

For more details about the benchmarks, see the `benchmarks/README.md` file.

## Install

To install libCEED, run:

```console
$ make install prefix=/path/to/install/dir
```

or (e.g., if creating packages):

```console
$ make install prefix=/usr DESTDIR=/packaging/path
```

To build and install in separate steps, run:

```console
$ make for_install=1 prefix=/path/to/install/dir
$ make install prefix=/path/to/install/dir
```

The usual variables like `CC` and `CFLAGS` are used, and optimization flags for all languages can be set using, for example, `OPT='-O3 -march=native'`.
Use `STATIC=1` to build static libraries (`libceed.a`).

To install libCEED for Python, run:

```console
$ pip install libceed
```

with the desired setuptools options, such as `--user`.

### pkg-config

In addition to the library and header, libCEED provides a [pkg-config](https://en.wikipedia.org/wiki/Pkg-config) file that can be used to easily compile and link.
[For example](https://people.freedesktop.org/~dbn/pkg-config-guide.html#faq), if `$prefix` is a standard location or you set the environment variable `PKG_CONFIG_PATH`:

```console
$ cc `pkg-config --cflags --libs ceed` -o myapp myapp.c
```

will build `myapp` with libCEED.
This can be used with the source or installed directories.
Most build systems have support for pkg-config.

## Contact

You can reach the libCEED team by emailing [ceed-users@llnl.gov](mailto:ceed-users@llnl.gov) or by leaving a comment in the [issue tracker](https://github.com/CEED/libCEED/issues).

## How to Cite

If you use libCEED, please cite:

```bibtex
@article{libceed-joss-paper,
  author       = {Jed Brown and Ahmad Abdelfattah and Valeria Barra and Natalie Beams and Jean Sylvain Camier and Veselin Dobrev and Yohann Dudouit and Leila Ghaffari and Tzanio Kolev and David Medina and Will Pazner and Thilina Ratnayaka and Jeremy Thompson and Stan Tomov},
  title        = {{libCEED}: Fast algebra for high-order element-based discretizations},
  journal      = {Journal of Open Source Software},
  year         = {2021},
  publisher    = {The Open Journal},
  volume       = {6},
  number       = {63},
  pages        = {2945},
  doi          = {10.21105/joss.02945}
}
```

The archival copy of the libCEED user manual is maintained on [Zenodo](https://doi.org/10.5281/zenodo.4302736).
To cite the user manual:

```bibtex
@misc{libceed-user-manual,
  author       = {Abdelfattah, Ahmad and
                  Barra, Valeria and
                  Beams, Natalie and
                  Brown, Jed and
                  Camier, Jean-Sylvain and
                  Dobrev, Veselin and
                  Dudouit, Yohann and
                  Ghaffari, Leila and
                  Kolev, Tzanio and
                  Medina, David and
                  Pazner, Will and
                  Ratnayaka, Thilina and
                  Shakeri, Rezgar and
                  Thompson, Jeremy L and
                  Tomov, Stanimire and
                  Wright III, James},
  title        = {{libCEED} User Manual},
  month        = dec,
  year         = 2022,
  publisher    = {Zenodo},
  version      = {0.11.0},
  doi          = {10.5281/zenodo.7480454}
}
```

For libCEED's Python interface please cite:

```bibtex
@InProceedings{libceed-paper-proc-scipy-2020,
  author    = {{V}aleria {B}arra and {J}ed {B}rown and {J}eremy {T}hompson and {Y}ohann {D}udouit},
  title     = {{H}igh-performance operator evaluations with ease of use: lib{C}{E}{E}{D}'s {P}ython interface},
  booktitle = {{P}roceedings of the 19th {P}ython in {S}cience {C}onference},
  pages     = {85 - 90},
  year      = {2020},
  editor    = {{M}eghann {A}garwal and {C}hris {C}alloway and {D}illon {N}iederhut and {D}avid {S}hupe},
  doi       = {10.25080/Majora-342d178e-00c}
}
```

The BibTeX entries for these references can be found in the `doc/bib/references.bib` file.

## Copyright

The following copyright applies to each file in the CEED software suite, unless otherwise stated in the file:

> Copyright (c) 2017-2023, Lawrence Livermore National Security, LLC and other CEED contributors.
> All rights reserved.

See files LICENSE and NOTICE for details.

[github-badge]: https://github.com/CEED/libCEED/workflows/C/Fortran/badge.svg
[github-link]: https://github.com/CEED/libCEED/actions
[gitlab-badge]: https://gitlab.com/libceed/libCEED/badges/main/pipeline.svg?key_text=GitLab-CI
[gitlab-link]: https://gitlab.com/libceed/libCEED/-/pipelines?page=1&scope=all&ref=main
[codecov-badge]: https://codecov.io/gh/CEED/libCEED/branch/main/graphs/badge.svg
[codecov-link]: https://codecov.io/gh/CEED/libCEED/
[license-badge]: https://img.shields.io/badge/License-BSD%202--Clause-orange.svg
[license-link]: https://opensource.org/licenses/BSD-2-Clause
[doc-badge]: https://readthedocs.org/projects/libceed/badge/?version=latest
[doc-link]: https://libceed.org/en/latest/?badge=latest
[joss-badge]: https://joss.theoj.org/papers/10.21105/joss.02945/status.svg
[joss-link]: https://doi.org/10.21105/joss.02945
[binder-badge]: http://mybinder.org/badge_logo.svg
[binder-link]: https://mybinder.org/v2/gh/CEED/libCEED/main?urlpath=lab/tree/examples/python/tutorial-0-ceed.ipynb