| 1a0eda08 | 01-Nov-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Fix regression bug from shared-memory non-tensor basis transpose |
| 7132caa0 | 20-Oct-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Remove need to zero out V vector before applying basis transpose for magma backends |
| 9e0c01fa | 20-Oct-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Initial commit to optimize magma backend transpose basis application |
| 833aa127 | 19-Oct-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Use shared-memory for transpose non-tensor basis kernel to accelerate load of A matrix |
| 9d15e85b | 18-Oct-2023 |
Sebastian Grimberg <sjg@amazon.com> |
H(div) and H(curl) basis support for magma backend |
| 940a72f1 | 10-Aug-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Formatting consistency for magma backend with cuda-ref and hip-ref
Includes JiT upgrades for Magma non-tensor basis to only compile for N values which are used at runtime. Adds JiT for Magma non-tensor basis CEED_EVAL_WEIGHT mode.
|
| 3c1e2aff | 11-Aug-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Formatting updates for MAGMA JiT kernels |
| f80f4a74 | 09-Aug-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Rename files in magma backend for consistency with other libCEED backends |
| cfa13e89 | 14-Sep-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Enforce consistent pointer alignment with clang-format |
| 672b0f2a | 14-Sep-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Fix some missing consistency issues from #1315 |
| ca735530 | 31-Aug-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
style - fixes for CUDA backends |
| 94b7b29b | 01-Sep-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
style - fix header guards |
| b2165e7a | 11-Aug-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Whitespace, style, and formatting updates for consistency between CUDA and HIP backends
Adds include guards in JiT header files, even if not strictly necessary, to match the precedent set in cuda-shared and hip-shared as well as sycl.
|
| 4a56ddfb | 11-Aug-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Remove two unused files from cuda-shared and hip-shared backends
|
| 49ed4312 | 10-Aug-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Minor formatting consistency for SYCL backend files |
| 58549094 | 15-Aug-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Unify magma backend restriction with cuda/hip-ref, keeping runtime option for deterministic and non-deterministic using magma/det
This also opens the opportunity to make cuda/hip-ref non-deterministic by default and add cuda/hip-ref/det variants.
|
| 6ca0f394 | 20-Jul-2023 |
Umesh Unnikrishnan <umesh.aero@gatech.edu> |
Add sycl/gen backend and other sycl changes (#1258)
---------
Co-authored-by: Kris Rowe <kris.rowe@anl.gov>
Co-authored-by: Kris Rowe <krowe@anl.gov>
Co-authored-by: Jed Brown <jed@jedbrown.org>
Co-authored-by: Varsha Madananth <vmadananth@uan-0002.head.cm.americas.sgi.com>
Co-authored-by: James Wright <jrwrigh.iii@gmail.com>
|
| f7c1b517 | 30-Jun-2023 |
nbeams <246972+nbeams@users.noreply.github.com> |
CUDA: improvements for handling large CeedVectors requiring CeedSize for length |
| 9330daec | 28-Jun-2023 |
nbeams <246972+nbeams@users.noreply.github.com> |
HIP: Improve support for CeedVectors that are longer than the max size of 32 bit integers |
| ff1e7120 | 15-Jun-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Very minor backend style and whitespace fixes |
| bd882c8a | 15-Jun-2023 |
James Wright <james@jameswright.xyz> |
Add sycl/ref and sycl/shared backends (#1229)
* Merge sycl_backend from ALCF fork
---------
Co-authored-by: Umesh Unnikrishnan <unnikrishnan@anl.gov>
Co-authored-by: Kris Rowe <krowe@anl.gov>
Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
|
| 2a86cc9d | 04-Mar-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Fix file endings inconsistency |
| 9bd0a4de | 06-Apr-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
include - move enums to ceed/types.h |
| 023b8a51 | 25-Jan-2023 |
abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com> |
magma: non-tensor rtc (#1141)
* some refactoring in magma's jit src
* fix path
* fix loading src
* refactor magma nontensor backend
* refactor magma nontensor backend
* [WIP]: new nontensor basis kernels
* [WIP]: new nontensor basis kernels
* [WIP]: new nontensor basis kernels
* call the new nontensor kernels for low order problems
* multiple compilation for the same kernels but with different tuning parameters
* magma: allow different nb's for different non-tensor kernels
* tuning data for the non-tensor rtc kernels
* remove no-longer used functions, add new one for tuning the nontensor kernels
* constants for tuning
* tuning functions
* use the tuning functions in compiling/running the new kernels
* bug fix
* fixes
* fixes
* minor
* switch tuning data
* fix name
* fix name
* add function to run cuda kernels with opt-in shared memory feature
* minor fix
* minor fix
* fix calls to batch api
* allow more kernel instances
* temporary timing function
* temporary timing function
* tuning data based on hiprtc
* rollback tuning parameters
* fixes
* fixes
* fix inconsistency in the parameters passed to nvrtc/hiprtc
* minor
* a fix to the nb selector
* cleanup
* merge the opt-in feature in CeedRunKernelDimSharedOptinCuda into CeedRunKernelDimSharedCuda
* fix paths for hip-magma backends
* style
* fixes
* running make format
* undo changes from the last commit
* change HIP_DIR to ROCM_DIR and adjust the paths for magma accordingly
* replace HIP_DIR with ROCM_DIR
|
| ea61e9ac | 30-Nov-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - assorted formatting fixes |