Searched hist:"074be161bac8d8f2ff6efdceafa0bbdf1835071b" (Results 1 – 3 of 3) sorted by relevance
| /libCEED/backends/cuda-shared/ |
| ceed-cuda-shared.h | diff 074be161bac8d8f2ff6efdceafa0bbdf1835071b Mon Jun 03 19:41:28 UTC 2019 Yohann Dudouit <yohann.dudouit@gmail.com> Optimization of weight kernel and dynamic allocation of shared memory.
- First optimization of the weight kernel: memory accesses are now coalesced, but the kernels might not fully use the SMs (elements need to be batched per block).
- Switch to dynamic shared memory allocation in order to batch elements for interpolation and gradient in the cuda-shared backend.
- Add GetPreferedMemoryType for the cuda-reg and cuda-shared backends. (This can be removed in the future by delegating this function.)
|
| ceed-cuda-shared.c | diff 074be161bac8d8f2ff6efdceafa0bbdf1835071b Mon Jun 03 19:41:28 UTC 2019 Yohann Dudouit <yohann.dudouit@gmail.com> Optimization of weight kernel and dynamic allocation of shared memory.
- First optimization of the weight kernel: memory accesses are now coalesced, but the kernels might not fully use the SMs (elements need to be batched per block).
- Switch to dynamic shared memory allocation in order to batch elements for interpolation and gradient in the cuda-shared backend.
- Add GetPreferedMemoryType for the cuda-reg and cuda-shared backends. (This can be removed in the future by delegating this function.)
|
| ceed-cuda-shared-basis.c | diff 074be161bac8d8f2ff6efdceafa0bbdf1835071b Mon Jun 03 19:41:28 UTC 2019 Yohann Dudouit <yohann.dudouit@gmail.com> Optimization of weight kernel and dynamic allocation of shared memory.
- First optimization of the weight kernel: memory accesses are now coalesced, but the kernels might not fully use the SMs (elements need to be batched per block).
- Switch to dynamic shared memory allocation in order to batch elements for interpolation and gradient in the cuda-shared backend.
- Add GetPreferedMemoryType for the cuda-reg and cuda-shared backends. (This can be removed in the future by delegating this function.)
|