| 833aa127 | 19-Oct-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Use shared-memory for transpose non-tensor basis kernel to accelerate load of A matrix |
| 9d15e85b | 18-Oct-2023 |
Sebastian Grimberg <sjg@amazon.com> |
H(div) and H(curl) basis support for magma backend |
| 1d281a7b | 17-Oct-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Remove accidental WIP tuning files from #1382
|
| 913f8461 | 17-Oct-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Adjust include groupings |
| e4e1133f | 17-Oct-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Fix mismatching declaration |
| 940a72f1 | 10-Aug-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Formatting consistency for magma backend with cuda-ref and hip-ref
Includes JiT upgrades for Magma non-tensor basis to only compile for N values which are used at runtime. Adds JiT for Magma non-tensor basis CEED_EVAL_WEIGHT mode.
|
| f80f4a74 | 09-Aug-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Rename files in magma backend for consistency with other libCEED backends |
| 0cb85d04 | 04-Oct-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Revert opt/blocked backend delegate, this is no longer needed after #1362 |
| 51888a71 | 03-Oct-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Correct opt/blocked fallback to opt/serial for things like QFunction assembly |
| 8130dc29 | 03-Oct-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Minor improvement to consistency between ref/blocked and opt/blocked backends for readability |
| 3bf1f308 | 04-Oct-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
cpu - clean up delegation/fallback between CPU backends |
| a71faab1 | 03-Oct-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Remove unused parameter from CeedTensorContractCreate |
| d53ea278 | 29-Sep-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
magma - fix make tidy issues |
| 1411c262 | 28-Sep-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
memcheck - move codecov markers |
| 35aed383 | 28-Sep-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
ref - fix rstr parent resource name check |
| b31f666e | 28-Sep-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
ref - move codecov markers |
| 1c7d1e03 | 20-Sep-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1340 from CEED/jeremy/fix-ptsc-orients-copy
Fix CEED_COPY_VALUES for ref rstr at points |
| 506b1a0c | 20-Sep-2023 |
Sebastian Grimberg <sebastiangrimb@gmail.com> |
Non-square operator full assembly (#1316)
* Consistency formatting for operator tests
* Add (failing) test for non-square operator full assembly
* Add support for CPU-based full assembly of non-square CeedOperators
* Fix full assembly of identity quadrature functions and operators with CEED_BASIS_NONE
Also adds a unit test for full assembly which fails prior to the changes in this commit.
* Minor refactor to improve coverage
|
| 58c07c4f | 20-Sep-2023 |
Sebastian Grimberg <sebastiangrimb@gmail.com> |
Support CPU shared-memory parallelism with OpenMP (#1279)
* Updates for OpenMP thread safety (one Ceed per thread, potentially with shared output vector)
* Makefile tabs vs. spaces consistency
* Fix unrelated pragma bug for Intel compilers in `backend.h`
* Address PR feedback: Use _OPENMP macro, simplify OpenMP pragma wrappers
* Address PR feedback: Add new statement macro CeedPragmaThreadPrivate to PREDEFINED entries in Doxyfile
* Add OpenMP support to Intel CI workflow for testing
* Add documentation for OPENMP option and update releasenotes.md
* Revise OpenMP implementation: Rather than enforcing global variables to be threadprivate, just wrap potential race conditions in a critical block (during registration)
* Avoid returning from OpenMP blocks
* Early break on error when registering backends or QFunctions
* Formatting fixes after rebase, newline after variable declarations
* Address PR feedback: Revert some unintentional changes to debug output
* Update codecov exclusions
|
| 07d5dec1 | 20-Sep-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
rstr - fix CEED_COPY_VALUES for ref rstr at points |
| 1249ccc5 | 19-Sep-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
rstr - renaming for clarity |
| 0930e4e7 | 14-Sep-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
rstr - add tests for AtPoints |
| 05fa913c | 13-Sep-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
rstr - allow restriction to ordered points evec |
| 2c7e7413 | 12-Sep-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
rstr - initial CPU implementation |
| 397164e9 | 15-Sep-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Small update for cuda-shared/hip-shared consistency |