| 94648b7d | 13-Jul-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Organize element restriction variants in ref backend |
| 7c1dbaff | 06-May-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Operator full assembly with oriented or curl-conforming element restrictions (RT or ND elements) |
| 20a93772 | 20-Jul-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Address PR comment on hidden variable and clarify by renaming |
| 0c73c039 | 22-Jun-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Improve element restriction for H(curl) spaces by reorganizing loops, including use of int8_t for tridiagonal matrix |
| 0305e208 | 06-May-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Update backends for unified ElemRestrictionCreate variants for all restriction types (default, oriented, strided) |
| 9475e044 | 27-Jul-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Fix bugs for blocked ElemRestriction apply |
| fcbe8c06 | 24-Apr-2023 |
Sebastian Grimberg <sjg@amazon.com> |
enum CeedRestrictionType for CeedElemRestriction type |
| 77d1c127 | 02-Mar-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Element restriction for high-order (> 1) H(curl) elements requiring more general orientation transformations
Adds CeedElemRestrictionCreateCurlOriented which takes a tridiagonal element-wise transformation matrix, typically with entries {-1, 0, 1}.
|
| 6ca0f394 | 20-Jul-2023 |
Umesh Unnikrishnan <umesh.aero@gatech.edu> |
Add sycl/gen backend and other sycl changes (#1258)
---------
Co-authored-by: Kris Rowe <kris.rowe@anl.gov>
Co-authored-by: Kris Rowe <krowe@anl.gov>
Co-authored-by: Jed Brown <jed@jedbrown.org>
Co-authored-by: Varsha Madananth <vmadananth@uan-0002.head.cm.americas.sgi.com>
Co-authored-by: James Wright <jrwrigh.iii@gmail.com>
|
| 05efb956 | 12-Jul-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Revert undesired change to ceed-ref-tensor.c |
| c8a55531 | 12-Jul-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Combine ceed-avx-tensor-f64 and -f32 into a single file for all precisions |
| 4548da4e | 12-Jul-2023 |
Sebastian Grimberg <sebastiangrimb@gmail.com> |
Update LIBXSMM backend (#1248)
* Fix LIBXSMM kernel generation calls after 9c0e481 in https://github.com/libxsmm/libxsmm
* Update LIBXSMM interface to work with main branch after commit 1f4cdad (in preparation for v2)
* Allow user specified BLAS_LIB for LIBXSMM dependency in Makefile
* LIBXSMM does not require kernels to be released
See https://github.com/libxsmm/libxsmm/issues/783#issuecomment-1596655284.
* Improvements for non-tensor CPU-based CeedBasisApply for q_comp > 1
* Revert previous commit since it's faster to apply in P*Q panels, remove an unnecessary LIBXSMM kernel compilation
* Remove an unused macro
* make format
* Rely on LIBXSMM to cache JIT'd kernels
* LIBXSMM dispatched kernels for xsmm/serial backend
* Combine ceed-xsmm-tensor-fp64 and -fp32 into single file for all precisions
* Address PR comments
* Update GitLab CI LIBXSMM version
|
| 1f97d2f1 | 07-Jul-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
doc - fixup errors |
| 23d4529e | 07-Jul-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
debug - add debug color enum |
| 8811b53c | 06-Jul-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
memcheck - error if QFunction application gives NaNs |
| dd39767d | 06-Jul-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
memcheck - use CeedSize in for loops as needed |
| df852985 | 06-Jul-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
memcheck - warn in CEED_DEBUG if NaN remains after restoring write-only access |
| 05c335cb | 06-Jul-2023 |
nbeams <246972+nbeams@users.noreply.github.com> |
Add headers missed by IWYU to hip-ref |
| f6f49adb | 30-Jun-2023 |
nbeams <246972+nbeams@users.noreply.github.com> |
Ensure initialization for new L1 norm handling in GPU backends |
| f7c1b517 | 30-Jun-2023 |
nbeams <246972+nbeams@users.noreply.github.com> |
CUDA: improvements for handling large CeedVectors requiring CeedSize for length |
| 3d13c0f2 | 29-Jun-2023 |
nbeams <246972+nbeams@users.noreply.github.com> |
Force use of CeedSize in calculating global idx |
| 90709ca6 | 29-Jun-2023 |
nbeams <246972+nbeams@users.noreply.github.com> |
simplify logic of hipblas use for norm |
| 9330daec | 28-Jun-2023 |
nbeams <246972+nbeams@users.noreply.github.com> |
HIP: Improve support for CeedVectors that are longer than the max size of 32 bit integers |
| ff1e7120 | 15-Jun-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Very minor backend style and whitespace fixes |
| 8e6aa226 | 28-Jun-2023 |
Jed Brown <jed@jedbrown.org> |
Fix -fsanitize=address bugs
note: ‘snprintf’ output between 21 and 31 bytes into a destination of size 30
The assembly test had inconsistent ordering of arguments between the qfunction itself and the call to CeedQFunctionAddInput.
|