History log of /libCEED/backends/ (Results 851 – 875 of 1139)
Revision Date Author Comments
2acd9924 19-Aug-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Working on cc >= 6

f1a13f77 19-Aug-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Remove atomicAdd function for compute capabilities > sm_60

241a4b83 25-Jul-2019 Yohann <yohann.dudouit@gmail.com>

Full jit compiled operator: cuda-gen backend (#275)

* First steps toward cuda-gen backend!

* Closer to real code generation.

* Generated code should be ready for nvrtc.

* The code generation skeleton is ready.

* Hack with the qfunction to make the operator kernel compile.

* Some tweaks in the makefile + Input fields structure change.

* Remove using cout.

* 1d interp and grad device functions.

* 1d readDofs, readQuads, writeDofs, writeQuads.

* Remove dead code.

* readDofs, readQuads, writeDofs, writeQuads for 2d and 3d

* 2d interp and grad

* 3d interp and grad

* - weight functions for 1d, 2d, 3d
  - link the indices to the kernel
  - link the fields to the kernel
  - link the basis to the kernel

* Add the qFunction reader + inlining

* Add qf files for the tests.

* Add qf file for ceed/ex1

* Add qf file for mfem/bp1

* All tests pass.

* Add qFunction for mfem/bp3, petsc/bp1, and petsc/bp3.

* mfem/bp1 passes + remove dead code

* Fix a bug in n_quads_out for writeQuads

* mfem/bp3 passes.

* All tests all examples pass.

* Temporary tweaks for mfem benchmarking

* Add Context management.

* Modify .qf files to take into account the context.

* Enable optimizations.

* First set of optimization for 2D and 3D.

* Makefile tweaks and destructor code.

* make style.

* Add -MP flag.

* Fix linking issues with the tests.

* Update .qf files for the tests.

* Add .qf files for nek5000 examples.

* Use shared memory for B and G matrices.

* Fix bug introduced in previous commit.

/libCEED/.travis.yml
/libCEED/Makefile
/libCEED/README.md
cuda-gen/ceed-cuda-gen-operator-build.cpp
cuda-gen/ceed-cuda-gen-operator-build.h
cuda-gen/ceed-cuda-gen-operator.c
cuda-gen/ceed-cuda-gen-qfunction.c
cuda-gen/ceed-cuda-gen.c
cuda-gen/ceed-cuda-gen.h
cuda/ceed-cuda.h
/libCEED/examples/Makefile
/libCEED/examples/README.md
/libCEED/examples/ceed/ex1.qf
/libCEED/examples/mfem/bp1.qf
/libCEED/examples/mfem/bp3.qf
/libCEED/examples/nek/.gitignore
/libCEED/examples/nek/Makefile
/libCEED/examples/nek/README.md
/libCEED/examples/nek/SIZE.in
/libCEED/examples/nek/boxes/b.box
/libCEED/examples/nek/boxes/b1e.rea
/libCEED/examples/nek/bps/bps.cu
/libCEED/examples/nek/bps/bps.okl
/libCEED/examples/nek/bps/bps.usr
/libCEED/examples/nek/nek-examples.sh
/libCEED/examples/nek5000/bp1.qf
/libCEED/examples/nek5000/bp3.qf
/libCEED/examples/petsc/bp1.qf
/libCEED/examples/petsc/bp2.qf
/libCEED/examples/petsc/bp3.qf
/libCEED/examples/petsc/bp4.qf
/libCEED/examples/petsc/bps.c
/libCEED/examples/petsc/common.qf
/libCEED/include/ceed-backend.h
/libCEED/tests/junit.py
/libCEED/tests/t209-elemrestriction-f.f90
/libCEED/tests/t400-qfunction-f.qf
/libCEED/tests/t400-qfunction.qf
/libCEED/tests/t401-qfunction-f.qf
/libCEED/tests/t401-qfunction.qf
/libCEED/tests/t500-operator-f.qf
/libCEED/tests/t500-operator.qf
/libCEED/tests/t501-operator-f.qf
/libCEED/tests/t501-operator.qf
/libCEED/tests/t502-operator-f.qf
/libCEED/tests/t502-operator.qf
/libCEED/tests/t510-operator-f.qf
/libCEED/tests/t510-operator.qf
/libCEED/tests/t511-operator-f.qf
/libCEED/tests/t511-operator.qf
/libCEED/tests/t520-operator-f.qf
/libCEED/tests/t520-operator.qf
/libCEED/tests/t521-operator-f.qf
/libCEED/tests/t521-operator.qf
/libCEED/tests/tap.sh
706bc5e6 18-Jul-2019 jeremylt <jeremy.thompson@colorado.edu>

backends: fix ref backend priorities

6f7d248d 12-Jul-2019 jeremylt <jeremy.thompson@colorado.edu>

Update CPU backends to give default for /cpu/self/***

e0fc0447 12-Jul-2019 jeremylt <jeremy.thompson@colorado.edu>

Fix resource strcmp in xsmm backends

f405f806 04-Jul-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Merge pull request #289 from CEED/cuda-occa-copy-vals

Update CUDA/OCCA CEED_COPY_VALUES logic

ea03cb95 03-Jul-2019 jeremylt <jeremy.thompson@colorado.edu>

Update CUDA/OCCA CEED_COPY_VALUES logic

1226057f 27-Jun-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Merge branch 'master' into yohann/cuda-restr-opt

Conflicts:
  backends/cuda-reg/ceed-cuda-reg-restriction.c
  backends/cuda-shared/ceed-cuda-shared-basis.c

9d77422e 26-Jun-2019 Jed Brown <jed@jedbrown.org>

Merge branch 'yohann/cuda-non-tensor' [PR #249]

* yohann/cuda-non-tensor:
  ceed-cuda: resolve -Wsign-compare for CUresult (unsigned enum) in CeedError
  make style.
  namespace cuda backends functions.
  Minor: styling
  Add CUDA_LIB_DIR_STUBS for systems that don't have CUDA drivers installed
  make style
  Remove useless function declaration.
  Add a reference non-tensor BasisApply for cuda backends.

ab7ab560 23-Jun-2019 Jed Brown <jed@jedbrown.org>

ceed-cuda: resolve -Wsign-compare for CUresult (unsigned enum) in CeedError

961116ec 17-Jun-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

make style.

4a6d4bbd 17-Jun-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

namespace cuda backends functions.

0109ba86 04-Jun-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Minor: styling

a7bd39da 10-Jun-2019 jeremylt <jeremy.thompson@colorado.edu>

Fix underinterpolation mode for /cpu/self backends

df4cfd6d 04-Jun-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Remove dead or unnecessary code.

3f63d318 04-Jun-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Remove dead code. Cuda-reg restriction optimization.

698ebc35 03-Jun-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Optimization of 3D kernels for cuda-shared backend.

d94769d2 03-Jun-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Optimization of 1D kernels for cuda-shared backend.

4247ecf3 03-Jun-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Optimization of 2D kernels for cuda-shared backend.

717ff8a3 03-Jun-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Minor bug fix

074be161 03-Jun-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Optimization of weight kernel and dynamic allocation of shared memory.

- First optimization of weight kernel, kernels are now coalesced but
  might not be fully using SMs (need to batch elements per block)
- Switch to dynamic shared memory allocation in order to batch elements
  for interpolation and gradient in cuda-shared backend.
- Add GetPreferedMemoryType for cuda-reg and cuda-shared backends.
  (Can be removed in the future with delegation of this function)

d3232bb7 30-May-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Optimization of cuda-reg restriction.

9ef20713 17-May-2019 Yohann Dudouit <yohann.dudouit@gmail.com>

Start the optimization of the Cuda restriction operator.

103dcb42 31-May-2019 jeremylt <jeremy.thompson@colorado.edu>

OCCA backend update note
