| c8e372f0 | 13-Mar-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | gen - add 3D mixed support |
| c433aabc | 11-Mar-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | cuda - fix 2D flattening |
| 412e5683 | 28-Feb-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | gpu - use 2d Flat variants in gen |
| 343e3094 | 26-Feb-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | gpu - isolate core 2D tensor logic to allow flat version |
| f725b54b | 26-Feb-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | gpu - add P_1D to template args for AtPoints |
| 90c30374 | 18-Mar-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | gen - use blocksize of 1 elem AtPoints |
| 99421279 | 10-Mar-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | cuda - use BASIS_T_1D in codegen |
| 826538b3 | 07-Mar-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | gen - restrict input/output array pointers |
| 59fa3f92 | 06-Mar-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | gen - use field names for clarity |
| 0c8fbeed | 26-Feb-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | gpu - gen should use GetArray over GetArrayWrite |
| 087855af | 24-Feb-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | gpu - gen put suboperators on separate streams |
| c99afcd8 | 24-Feb-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | gpu - gen ApplyAdd functions |
| e9c76bdd | 19-Feb-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | gpu - allow running shared kernels on stream |
| ea04d07f | 11-Feb-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | gpu - isolate gen ApplyAdd inner logic |
| 45a787f7 | 07-Feb-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | gpu - use struct over array for clarity |
| 0a2a6492 | 06-Feb-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | cuda - remove duplicate mats in gen |
| c9192aca | 07-Feb-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | gpu - swap out bitwise assignment operators for bools Co-authored-by: Zach Atkins <zach.atkins@colorado.edu> |
| 8d12f40e | 07-Feb-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | hip - gen fallback to shared if error |
| ddae5012 | 07-Feb-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | cuda - gen fallback to shared if error |
| f82027a4 | 30-Jan-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | gpu - update gen non-tensor block strategy |
| 9123fb08 | 29-Jan-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | hip - nontensor gen operators |
| dc007f05 | 27-Jan-2025 | Jeremy L Thompson <jeremy@jeremylt.org> | cuda - nontensor gen operators |
| 3a2968d6 | 17-Dec-2024 | Jeremy L Thompson <jeremy@jeremylt.org> | hip - AtPoints for hip/gen |
| 8b97b69a | 05-Dec-2024 | Jeremy L Thompson <jeremy@jeremylt.org> | cuda - AtPoints for cuda/gen |
| f815fac9 | 09-Dec-2024 | Jeremy L Thompson <jeremy@jeremylt.org> | gen - fun name standardization |