| 99421279 | 10-Mar-2025 |
Jeremy L Thompson <jeremy@jeremylt.org> |
cuda - use BASIS_T_1D in codegen |
| af0e6e89 | 13-Feb-2025 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - add Transpose/TransposeAdd variants for AtPoints |
| a24d84ea | 09-Jan-2025 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - fix AtPoints transpose shift |
| 6c13bbcb | 07-Jan-2025 |
Jeremy L Thompson <jeremy@jeremylt.org> |
hip - add nontensor shared |
| aa4002ad | 03-Jan-2025 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - use gen LoadMatrix in shared |
| 9ff05d55 | 03-Jan-2025 |
Jeremy L Thompson <jeremy@jeremylt.org> |
cuda - add nontensor shared |
| 688b5473 | 13-Dec-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - fix style in gen templates |
| 4eda27c2 | 13-Dec-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - minor fix to 1d AtPoints basis transpose |
| 8b97b69a | 05-Dec-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
cuda - AtPoints for cuda/gen |
| b6a2eb79 | 09-Dec-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
shared - AtPoints template changes for gen |
| f815fac9 | 09-Dec-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gen - fun name standardization |
| 5f954c19 | 27-Nov-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - template loop over points for basis action |
| 9e1d4b82 | 07-Nov-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - shared AtPoints |
| 81ae6159 | 11-Nov-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - split AtPoints basis between Transpose/no |
| 6a96780f | 18-Oct-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - header consistency |
| c0b5abf0 | 17-Oct-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
qf - prefer ceed/types.h over ceed.h in qf source |
| 111870fe | 04-Sep-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
AtPoints - fix transpose basis apply on GPU |
| db2becc9 | 13-Aug-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Add CeedBasisApplyAdd (#1644)
* basis - add CeedBasisApplyAdd + CPU impl
* basis - add ref GPU ApplyAdd
* basis - add shared GPU ApplyAdd
* basis - add MAGMA ApplyAdd
* basis - add CeedBasisApplyAddAtPoints + default impl
* basis - add GPU ApplyAddAtPoints
* tidy - add extra assert to fix clang-tidy
* Apply suggestions from code review
style - consistently use indexing over pointer arithmetic
Co-authored-by: Zach Atkins <zach.atkins@colorado.edu>
* style - more pointer fixes
---------
Co-authored-by: Zach Atkins <zach.atkins@colorado.edu>
|
| 80c135a8 | 10-Jul-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - less explicit memory shuffling to build Chebyshev der |
| ad8059fc | 10-Jul-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - reduce write conflicts for AtPoints basis operations |
| f7c9815f | 20-Jun-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
AtPoints - ease memory requirement |
| 2d10e82c | 17-Jun-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
AtPoints - fix gpu thread usage |
| 34d14614 | 30-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
cuda - impl BasisApplyAtPoints |
| 2d903c70 | 12-Jun-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - drop extra copy in ref tensor basis interp |
| 0b2e4913 | 12-Jun-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
style - bool name consistency |