History log of /libCEED/backends/ (Results 876 – 900 of 1139)
Revision Date Author Comments
bec1c034 30-May-2019 jeremylt <jeremy.thompson@colorado.edu>

NOLINT for OCCA tensor contract false positive

f5ef5ec0 29-May-2019 jeremylt <jeremy.thompson@colorado.edu>

OCCA Backend clang-tidy fixes

a4999edd 24-May-2019 jeremylt <jeremy.thompson@colorado.edu>

Update Ceed Delegate referencing

aefd8378 29-Apr-2019 jeremylt <jeremy.thompson@colorado.edu>

Add delegates for specific objects

f8902d9e 24-May-2019 jeremylt <jeremy.thompson@colorado.edu>

VecCreate -> VectorCreate

89c6efa4 03-May-2019 jeremylt <jeremy.thompson@colorado.edu>

Use blocking in optimized serial backends

045b9c47 29-Mar-2019 jeremylt <jeremy.thompson@colorado.edu>

Include full evec blocked backend

a7652942 28-Mar-2019 jeremylt <jeremy.thompson@colorado.edu>

Add restriction by block to /cpu/self/*/blocked

be9261b7 28-Mar-2019 jeremylt <jeremy.thompson@colorado.edu>

Add ElemRestrictionApplyBlock

abe33e54 16-May-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

make style

469f0220 16-May-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Remove useless function declaration.

9ad45357 16-May-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Add a reference non-tensor BasisApply for cuda backends.

c532df63 16-May-2019 Yohann <dudouit1@llnl.gov>

Cuda backend using shared memory (#247)

Add a GPU backend based on Cuda using shared memory.

* Draft of a shared memory backend

* New basis apply passes all tests.

* Add the possibility to treat several elements in one block of threads.

* Fix an error in 2D and 3D gradient.

* Put the cuda-shared backend in its own folder.

* Minor cleaning.

* Replace <ceed-impl.h> with <ceed-backend.h>

* make style

* Add a few CeedChk_Cu

8d75ea1b 18-Apr-2019 jeremylt <jeremy.thompson@colorado.edu>

Fix include statements

fc7cf9a0 18-Apr-2019 jeremylt <jeremy.thompson@colorado.edu>

Set QFunction outputs undefined before apply in new memcheck backend


/libCEED/.travis.yml
/libCEED/Makefile
/libCEED/README.md
memcheck/ceed-memcheck-qfunction.c
memcheck/ceed-memcheck.c
memcheck/ceed-memcheck.h
ref/ceed-ref-qfunction.c
/libCEED/examples/navier-stokes/navierstokes.c
/libCEED/examples/nek5000/.gitignore
/libCEED/examples/nek5000/bp1.usr
/libCEED/examples/nek5000/bp3.usr
/libCEED/interface/ceed-fortran.c
/libCEED/tests/junit.py
/libCEED/tests/t100-vec-f.f90
/libCEED/tests/t101-vec-f.f90
/libCEED/tests/t102-vec-f.f90
/libCEED/tests/t103-vec-f.f90
/libCEED/tests/t105-vec-f.f90
/libCEED/tests/t106-vec-f.f90
/libCEED/tests/t108-vec-f.f90
/libCEED/tests/t109-vec-f.f90
/libCEED/tests/t109-vec.c
/libCEED/tests/t110-vec-f.f90
/libCEED/tests/t110-vec.c
/libCEED/tests/t200-elemrestriction-f.f90
/libCEED/tests/t201-elemrestriction-f.f90
/libCEED/tests/t202-elemrestriction-f.f90
/libCEED/tests/t203-elemrestriction-f.f90
/libCEED/tests/t204-elemrestriction-f.f90
/libCEED/tests/t205-elemrestriction-f.f90
/libCEED/tests/t206-elemrestriction-f.f90
/libCEED/tests/t207-elemrestriction-f.f90
/libCEED/tests/t301-basis-f.f90
/libCEED/tests/t302-basis-f.f90
/libCEED/tests/t303-basis-f.f90
/libCEED/tests/t304-basis-f.f90
/libCEED/tests/t305-basis-f.f90
/libCEED/tests/t311-basis-f.f90
/libCEED/tests/t312-basis-f.f90
/libCEED/tests/t313-basis-f.f90
/libCEED/tests/t400-qfunction-f.f90
/libCEED/tests/t401-qfunction-f.f90
/libCEED/tests/t500-operator-f.f90
/libCEED/tests/t501-operator-f.f90
/libCEED/tests/t502-operator-f.f90
/libCEED/tests/t510-operator-f.f90
/libCEED/tests/t511-operator-f.f90
/libCEED/tests/t520-operator-f.f90
/libCEED/tests/t521-operator-f.f90
/libCEED/tests/tap.sh
30ea05eb 06-May-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Force Context existence with cudaFree(0).

5e9d07a7 06-May-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Modify the device initialization

974a6da5 29-Apr-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Fix CeedChk with CeedChk_Cu in the Cuda backend.

56f1838c 28-Mar-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Add atomicAdd in /cuda/ref backend for compute capability < 6.0

c907536f 27-Mar-2019 jeremylt <jeremy.thompson@colorado.edu>

Add CeedGetPreferredMemType

656dd4b7 24-Mar-2019 jeremylt <jeremy.thompson@colorado.edu>

Add error message if XSMM kernel fails to build

3d0fd664 21-Mar-2019 jeremylt <jeremy.thompson@colorado.edu>

Add kernel caching to XSMM backend

Make style and comment updates

XSMM tensor ind logic fix

Logic cleanup

c71e1dcd 20-Mar-2019 jeremylt <jeremy.thompson@colorado.edu>

Add Basis argument to TensorContractCreate

de686571 14-Mar-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Small clang-tidy fixes (#215)

55ae60f9 14-Mar-2019 Yohann <yohann.dudouit@gmail.com>

Simple Cuda backend using one thread per element (#195)

Thanks-to: Jeremy Thompson

* Take into account the compute capability of the GPU

* Add the cuda/reg backend and rename cuda to cuda/ref.

- cuda/reg uses a simple approach where each element is
processed by one thread. This approach is expected to be
efficient for 1D and 2D problems, but very inefficient
as soon as the kernels start to spill, which should arise
around Q1D=4 for 3D problems.

* Compilation takes into account the deviceId

* Make style

* Remove dead code in cuda qFunctions.

* Cuda-reg specialized Restriction.

* Split the Prolongation operator into Identity/not Identity.

* Remove "#pragma unroll" until further perf investigation.

* README update

* Add a description of cuda/reg.

* Add CompositeOperator msg to CUDA backends
