| #
e75c1c2d
|
| 29-Jun-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
make tidy
|
| #
3f1466f8
|
| 26-Jun-2020 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #561 from CEED/jeremy/is-deterministic
Ceed - add IsDeterministic
|
| #
52d8ac88
|
| 25-Jun-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
CUDA - add missing codecov exceptions
|
| #
9525855c
|
| 17-Jun-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
Ceed - add IsDeterministic
|
| #
4d36c801
|
| 24-Jun-2020 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #568 from CEED/jeremy/cuda-destroy
Small CUDA Tidying
|
| #
7df94212
|
| 23-Jun-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
CUDA - clean up includes
|
| #
73b3ccaf
|
| 23-Jun-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
CUDA - clean up minor memory leak
|
| #
65275b31
|
| 13-May-2020 |
valeriabarra <valeriabarra21@gmail.com> |
Merge branch 'master' into valeria/NSfixes
|
| #
ab213215
|
| 23-Apr-2020 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
More comments in CUDA backends (#518)
* CUDA - adding comments as I work to understand these backends
* PETSc - remove extra include, breaks single source
* make style
|
| #
621cd461
|
| 16-Mar-2020 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #421 from SanderA/sanderarens/fix_ceed_cuda_subclasses
Add Ceed_Cuda struct to Ceed_Cuda_ref/shared/gen.
|
| #
c00ee0d7
|
| 23-Nov-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #413 from CEED/jeremy/op-add
Add CeedOperatorApplyAdd
|
| #
5afe0718
|
| 23-Nov-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
CUDA - fix up composite operator delegation
|
| #
abfaacbb
|
| 17-Nov-2019 |
Sander Arens <sanderarens@gmail.com> |
Add Ceed_Cuda struct to Ceed_Cuda_ref/shared/gen.
Now Ceed_Cuda_ref/shared/gen act like subclasses and can be properly cast to Ceed_Cuda.
|
| #
1226057f
|
| 27-Jun-2019 |
Yohann Dudouit <yohann.dudouit@gmail.com> |
Merge branch 'master' into yohann/cuda-restr-opt
Conflicts: backends/cuda-reg/ceed-cuda-reg-restriction.c backends/cuda-shared/ceed-cuda-shared-basis.c
|
| #
9d77422e
|
| 26-Jun-2019 |
Jed Brown <jed@jedbrown.org> |
Merge branch 'yohann/cuda-non-tensor' [PR #249]
* yohann/cuda-non-tensor:
  ceed-cuda: resolve -Wsign-compare for CUresult (unsigned enum) in CeedError
  make style.
  namespace cuda backends functions.
  Minor: styling
  Add CUDA_LIB_DIR_STUBS for systems that don't have CUDA drivers installed
  make style
  Remove useless function declaration.
  Add a reference non-tensor BasisApply for cuda backends.
|
| #
df4cfd6d
|
| 04-Jun-2019 |
Yohann Dudouit <yohann.dudouit@gmail.com> |
Remove dead or unnecessary code.
|
| #
074be161
|
| 03-Jun-2019 |
Yohann Dudouit <yohann.dudouit@gmail.com> |
Optimization of weight kernel and dynamic allocation of shared memory.
- First optimization of weight kernel, kernels are now coalesce but might not be fully using SMs (need to batch elements per block)
- Switch to dynamic shared memory allocation in order to batch elements for interpolation and gradient in cuda-shared backend.
- Add GetPreferedMemoryType for cuda-reg and cuda-shared backends. (Can be removed in the future with delegation of this function)
|
| #
1856ee7c
|
| 29-May-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #239 from CEED/decorator
Add delegates for specific objects
|
| #
a4999edd
|
| 24-May-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Update Ceed Delegate refrencing
|
| #
9ad45357
|
| 16-May-2019 |
Yohann Dudouit <yohann.dudouit@gmail.com> |
Add a reference non-tensor BasisApply for cuda backends.
|
| #
c532df63
|
| 16-May-2019 |
Yohann <dudouit1@llnl.gov> |
Cuda backend using shared memory (#247)
Add a GPU backend based on Cuda using shared memory.
* Draft of a shared memory backend
* New basis apply passes all tests.
* Add the possibility to treat several elements in one block of threads.
* Fix an error in 2D and 3D gradient.
* Put the cuda-shared backend in its own folder.
* Minor cleaning.
* Replace <ceed-impl.h> with <ceed-backend.h>
* make style
* Add a few CeedChk_Cu
|