| #
d4cc1845
|
| 30-Dec-2025 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1912 from CEED/jeremy/copyright
minor - update copyright to 2026
|
| #
9ba83ac0
|
| 19-Dec-2025 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - update copyright to 2026
|
| #
20a16a5f
|
| 20-Mar-2025 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1786 from CEED/jeremy/copy-headers
minor - update copyright to 2025
|
| #
d275d636
|
| 19-Mar-2025 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - update copyright to 2025
|
| #
8c2d8641
|
| 13-Feb-2025 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1754 from CEED/jeremy/shared-points-transpose
gpu - add Transpose/TransposeAdd variants for AtPoints
|
| #
af0e6e89
|
| 13-Feb-2025 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - add Transpose/TransposeAdd variants for AtPoints
|
| #
1a63be7e
|
| 09-Jan-2025 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1721 from CEED/jeremy/shared-nontensor
Add non-tensor shared
|
| #
6c13bbcb
|
| 07-Jan-2025 |
Jeremy L Thompson <jeremy@jeremylt.org> |
hip - add nontensor shared
|
| #
be8d6f55
|
| 12-Nov-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1710 from CEED/jeremy/split-at-points
Split AtPoints basis between Transpose/no
|
| #
81ae6159
|
| 11-Nov-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - split AtPoints basis between Transpose/no
|
| #
25c4e04a
|
| 05-Sep-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1655 from CEED/jeremy/at-points-transpose
AtPoints - fix transpose basis apply on GPU
|
| #
111870fe
|
| 04-Sep-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
AtPoints - fix transpose basis apply on GPU
|
| #
db2becc9
|
| 13-Aug-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Add CeedBasisApplyAdd (#1644)
* basis - add CeedBasisApplyAdd + CPU impl
* basis - add ref GPU ApplyAdd
* basis - add shared GPU ApplyAdd
* basis - add MAGMA ApplyAdd
* basis - add CeedBasisApplyAddAtPoints + default impl
* basis - add GPU ApplyAddAtPoints
* tidy - add extra assert to fix clang-tidy
* Apply suggestions from code review
style - consistently use indexing over pointer arithmetic
Co-authored-by: Zach Atkins <zach.atkins@colorado.edu>
* style - more pointer fixes
---------
Co-authored-by: Zach Atkins <zach.atkins@colorado.edu>
|
| #
37fb1fa7
|
| 11-Jul-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1597 from CEED/jeremy/basis-at-points-gpu
GPU BasisApplyAtPoints
|
| #
1dda9c1a
|
| 17-Jun-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - add initial AtPoints to shared mem backends, but using ref impl
|
| #
509d4af6
|
| 28-Mar-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Drop JiT Guards in Most QF Source (#1540)
* backend - use pragma once
* gallery - drop source header guards
* ex - drop some qfunction source header guards
* fluids - drop guards on all singly included headers
* jit - drop most guards on backend JiT files
* sycl - drop extra header guards
* jit - enable #pragma once for QF source
* fluids - use #pragma once for util/helper qf source
* test - check different multiple includes
* fluids - fix odd include
* jit - update interface for building JiT string from multiple files
|
| #
a171b6ef
|
| 27-Mar-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1537 from CEED/jeremy/pragma-once
Use #pragma once for non-JiT headers
|
| #
5aed82e4
|
| 27-Mar-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - update copyright headers
|
| #
31c137a9
|
| 01-Sep-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1320 from CEED/jeremy/jit-header-guards
style - fix header guards
|
| #
94b7b29b
|
| 01-Sep-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
style - fix header guards
|
| #
6e6704a8
|
| 19-Apr-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1198 from CEED/jeremy/CeedCheck
Add CeedCheck macro to reduce repetition
|
| #
6574a04f
|
| 18-Apr-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
internal - add CeedCheck macro to reduce repetition
|
| #
49aac155
|
| 24-Mar-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
IWYU fixes (#1182)
* iwyu - include fixes
* iwyu - silence some iwyu output
* minor - clearer macro names
* iwyu - fix suggestion of "ceed/ceed.h" externally
* iwyu - lighter petsc headers
* iwyu - ceed/ceed.h -> ceed.h
* iwyu - cuda/hip include fixes
|
| #
2b730f8b
|
| 17-Nov-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Switch to clang-format (#1051)
* style - switch to clang-format
* ci - use newer libxsmm
* action - update format action
* format - consistent use of {} for multi-line if/for
* make - remove stray newline
* make - simpler 'make format' target
* ci - use newer libxsmm
* doc - minor release note clarification
* minor - minor fix
* minor - minor fix
* minor - minor fix
* minor - minor fix
* make format
* format - less aggressive alignment rules
* tidy - check for argument name mismatches
* fix newline
* format - mirror Ratel update to .clang-format
* fix merge error
* fix merge conflict
* fix merge error
* drop style in .phony list
* Update .clang-format
Co-authored-by: Jed Brown <jed@jedbrown.org>
* apply updated format
Co-authored-by: Jed Brown <jed@jedbrown.org>
|
| #
9e201c85
|
| 23-Sep-2022 |
Yohann <dudouit1@llnl.gov> |
Refactor `cuda-gen` and `hip-gen` backends. (#1050)
* Add TODO items.
* rough, but something like this?
* wip - cleaning up some warnings, but more remain
* wip - reorganize
* wip - missing kernels
* wip - replace t1d
* fix some kernels
* another typo
* more
* another one
* closer
* define T_1D
* typosgit add .!
* WIP: changes to cuda-shared framework for new kernels
* fix output writing
* buffer fix
* buffer sizes
* WIP: fixes for 2 and 3D basis kernels
* minor
* fix weight kernel for 3d
* remove debugging output
* minor reorg
* fix includes
* enable collo grad for cuda-shared
* move quoted kernels
* renaming
* missed a rename
* small fix
* more naming consistency
* faster 'useCollograd=false' path in *-gen
* more style
* one last style fix
* clearer collograd condition
* Add gen basis kernels to hip-shared
* Try some changes to hip-shared basis block sizes for new kernels
* cuda - drop extra kernel arg
* cuda - fix collograd check logic
* update gen comment about parallelization
* tidy up fields struct definition
* tidy up structs even more
* Update hip-gen basis templates use and move other hip-gen device functions to jit-source
* Finish hip-gen basis template update; small style updates to match CUDA
* missing isStrided
* Update block size used in 3D weight for new shared kernels
* update release notes
Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
Co-authored-by: nbeams <246972+nbeams@users.noreply.github.com>
|