Searched hist:"2dc3fb5f4d99263629ede9783b5752ff8ee2177f" (Results 1 – 2 of 2) sorted by relevance
| /libCEED/backends/magma/ |
| ceed-magma.h | diff 2dc3fb5f4d99263629ede9783b5752ff8ee2177f Wed Aug 31 19:53:22 UTC 2022 abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com> Icl/magma ntgemm (#1060)
* tuning data and driver for the non-tensor gemm
* header
* update magma non-tensor sgemm/dgemm to use the gemm selector
* add cpp files for the magma backend
* minor fix
* define CEED_INTERN for every function instead of a block definition
* include tuning data for CUDA or HIP only
* recent tuning data for a100 and mi250x
* style
* remove unused declarations
* expand tuning data for v100 and mi100
* switch to std array instead of std vector for individual records
* choose between gfx90a and gfx908 for HIP
* bug fix: choose between magma and vendor blas in non-batch mode
* style
|
| /libCEED/ |
| Makefile | diff 2dc3fb5f4d99263629ede9783b5752ff8ee2177f Wed Aug 31 19:53:22 UTC 2022 abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com> Icl/magma ntgemm (#1060)
|