| b73fa92c | 24-Jun-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
style - also format .cu files |
| 956a3dba | 24-Jun-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
hip - fix references to .cu |
| 3196072f | 24-Jun-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
hip - add Vec*Strided utils |
| f1c2287b | 24-Jun-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
cuda - add Vec*Strided utils |
| 9ef22048 | 24-Jun-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
cuda - follow multi line statement conventions |
| 097cc795 | 21-Jun-2024 |
James Wright <james@jameswright.xyz> |
basis: CreateProjection set q_ref, q_weight to NULL |
| fc0f7cc6 | 31-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
basis - ApplyAtPoints should take number of elem |
| 75765c5e | 29-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - fix typo |
| 56ef6bd5 | 28-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1591 from CEED/jeremy/ref-clean-access
Clean up QF assembly memory access |
| ff8551c5 | 28-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
style - num_active_* => qf_size_* |
| 1a3e18b3 | 23-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
memcheck - vec writable buffer |
| c7b67790 | 23-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
cpu - clean up QF assembly memory access |
| 29ec485e | 22-May-2024 |
Jed Brown <jed@jedbrown.org> |
backends/cuda: NVRTC compile to CUBIN when supported (resolve #1587)
This allows using a newer CUDA runtime with an older driver, and seems to have no downsides.
NVRTC can generate cubins directly starting with CUDA 11.1. [...] NVRTC used to support only virtual architectures through the option -arch, since it was only emitting PTX. It will now support actual architectures as well to emit SASS. The interface is augmented to retrieve either the PTX or cubin if an actual architecture is specified.
https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html#dynamic-code-generation
|
| 0b63de31 | 17-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
rstr - transpose AtPoints restriction fixes |
| 8be297ee | 14-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
points - fix gpu conversion to standard indexing for num_comp with rstr AtPoints |
| 0c7f167f | 14-May-2024 |
Zach Atkins <zach.atkins@colorado.edu> |
Update backends/ref/ceed-ref-operator.c
Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org> |
| 64a7ec2f | 14-May-2024 |
Zach Atkins <zach.atkins@colorado.edu> |
Fix bug in diagonal assembly for point operators |
| 831877b7 | 09-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
tidy - minor fix of unititalized value warning |
| b20a4af9 | 08-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
cuda - add ElemRestrictionAtPoints |
| 637baffd | 08-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
rstr - minor naming fix |
| ff1bc20e | 08-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
rstr - allow backends to add padding to AtPoints E-vec |
| fe960054 | 07-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
hip - add ElemRestrictionAtPoints |
| f7488153 | 02-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1464 from CEED/jeremy/libxsmm-update
xsmm - update for function name change |
| f220c67c | 06-Feb-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
xsmm - update for function name change |
| fb133d4b | 25-Apr-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
cpu - add AssembleAddDiagonal for AtPoints Operator |