README.md (0a1d75a00eef2c2b2c9cdfbd3bcf319dba0408f2) README.md (84a01de5ce080ac9cdd243d9d64da2df0ae9cb77)
# libCEED: the CEED API Library

[![Build Status](https://travis-ci.org/CEED/libCEED.svg?branch=master)](https://travis-ci.org/CEED/libCEED)
[![Code Coverage](https://codecov.io/gh/CEED/libCEED/branch/master/graphs/badge.svg)](https://codecov.io/gh/CEED/libCEED/)
[![License](https://img.shields.io/badge/License-BSD%202--Clause-orange.svg)](https://opensource.org/licenses/BSD-2-Clause)
[![Doxygen](https://codedocs.xyz/CEED/libCEED.svg)](https://codedocs.xyz/CEED/libCEED/)

## Code for Efficient Extensible Discretization

--- 74 unchanged lines hidden ---

## Backends

There are multiple supported backends, which can be selected at runtime in the examples:

| CEED resource | Backend |
| :----------------------- | :------------------------------------------------ |
| `/cpu/self/ref/serial` | Serial reference implementation |
| `/cpu/self/ref/blocked` | Blocked reference implementation |
| `/cpu/self/tmpl` | Backend template, delegates to `/cpu/self/ref/blocked` |
| `/cpu/self/avx/serial` | Serial AVX implementation |
| `/cpu/self/avx/blocked` | Blocked AVX implementation |
| `/cpu/self/xsmm/serial` | Serial LIBXSMM implementation |
| `/cpu/self/xsmm/blocked` | Blocked LIBXSMM implementation |
| `/cpu/occa` | Serial OCCA kernels |
| `/gpu/occa` | CUDA OCCA kernels |
| `/omp/occa` | OpenMP OCCA kernels |
| `/ocl/occa` | OpenCL OCCA kernels |
| `/gpu/cuda` | Pure CUDA kernels |
| `/gpu/magma` | CUDA MAGMA kernels |

The `/cpu/self/*/serial` backends process one element at a time and are intended for meshes
with a smaller number of high-order elements. The `/cpu/self/*/blocked` backends process
blocked batches of eight interlaced elements and are intended for meshes with a larger number
of elements.

The `/cpu/self/ref/*` backends are written in pure C and provide basic functionality.

The `/cpu/self/avx/*` backends rely upon AVX instructions to provide vectorized CPU performance.

The `/cpu/self/xsmm/*` backends rely upon the [LIBXSMM](http://github.com/hfp/libxsmm) package
to provide vectorized CPU performance.

The `/*/occa` backends rely upon the [OCCA](http://github.com/libocca/occa) package to provide
cross-platform performance.

The `/gpu/cuda` backend provides GPU performance strictly using CUDA.
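
As a rough illustration of the serial versus blocked guidance above, the sketch below builds a `/cpu/self/<impl>/<serial|blocked>` resource string from an element count and computes how many batches of eight a blocked backend would need. The helper names `choose_cpu_resource` and `num_blocks` and the threshold of 64 elements are hypothetical and chosen only for this example. They are not part of libCEED and are not a tuning recommendation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Batch size of the blocked CPU backends, as described in the text above. */
#define BLOCK_SIZE 8

/* Number of batches needed to cover num_elem elements.
   The last batch may be only partially filled. */
int num_blocks(int num_elem) {
  assert(num_elem >= 0);
  return (num_elem + BLOCK_SIZE - 1) / BLOCK_SIZE;
}

/* Hypothetical helper, not a libCEED function: writes a resource string of
   the form /cpu/self/<impl>/<serial|blocked> into out.
   Returns 0 on success and 1 if impl is empty or out is too small. */
int choose_cpu_resource(const char *impl, int num_elem, char *out, size_t out_len) {
  /* Assumption for illustration only: below this element count, prefer the
     serial variant; at or above it, prefer the blocked variant. */
  const int threshold = 64;

  if (impl == NULL || strlen(impl) == 0) return 1;

  const char *mode = (num_elem < threshold) ? "serial" : "blocked";
  int n = snprintf(out, out_len, "/cpu/self/%s/%s", impl, mode);

  /* snprintf reports the length it wanted to write; treat truncation as an error. */
  return (n < 0 || (size_t)n >= out_len) ? 1 : 0;
}
```

The resulting string is the kind of value that would be passed as the `resource` argument of `CeedInit`. In practice the best choice likely depends on the element order and the hardware as well as the element count, so benchmarking both variants is safer than relying on a fixed threshold.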

--- 125 unchanged lines hidden ---