| acb2c48c | 13-Mar-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
vec - add CeedVectorViewRange utility |
| 6d4d9f84 | 10-Mar-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1174 from CEED/jeremy/operator-context
CeedOperatorContext* -> CeedOperator*Context* |
| 5fb68f37 | 08-Mar-2023 |
Karen (Ren) Stengel <karenlstengel@gmail.com> |
Adding CeedVectorCopy() and CeedVectorAXPBY() (#1170)
* adding CeedVectorAXPBY and CeedVectorCopy functions with Rust, Python, and CUDA/HIP backend support
---------
Co-authored-by: James Wright <james@jameswright.xyz>
Co-authored-by: Adeleke O. Bankole <86932837+AdelekeBankole@users.noreply.github.com>
Co-authored-by: Jed Brown <jed@jedbrown.org>
|
| 17b0d5c6 | 07-Mar-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
style - rename CeedOperatorContext functions for consistency |
| 0126412d | 07-Mar-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
op - add CeedOperatorGetContext |
| b559d91b | 15-Feb-2023 |
Jed Brown <jed@jedbrown.org> |
ci: upgrade to clang-format-15 (#1157) |
| 3fc26417 | 06-Feb-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1150 from CEED/jeremy/multi-assembly
CPU diagonal assembly for multiple active bases |
| de5900ad | 03-Feb-2023 |
James Wright <james@jameswright.xyz> |
operator: Get OperatorField by field name |
| 437c7c90 | 30-Jan-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
pc - CPU diagonal assembly for multiple active bases |
| 1f61f649 | 27-Jan-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1148 from CEED/jeremy/operator-context-get
OperatorContextGet[Double, Int] |
| 2788fa27 | 24-Jan-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
ctx - Context[Get,Restore][Double,Int32]Read |
| 023b8a51 | 25-Jan-2023 |
abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com> |
magma: non-tensor rtc (#1141)
* some refactoring in magma's jit src
* fix path
* fix loading src
* refactor magma nontensor backend
* refactor magma nontensor backend
* [WIP]: new nontensor basis kernels
* [WIP]: new nontensor basis kernels
* [WIP]: new nontensor basis kernels
* call the new nontensor kernels for low order problems
* multiple compilation for the same kernels but with different tuning parameters
* magma: allow different nb's for different non-tensor kernels
* tuning data for the non-tensor rtc kernels
* remove no-longer used functions, add new one for tuning the nontensor kernels
* constants for tuning
* tuning functions
* use the tuning functions in compiling/running the new kernels
* bug fix
* fixes
* fixes
* minor
* switch tuning data
* fix name
* fix name
* add function to run cuda kernels with opt-in shared memory feature
* minor fix
* minor fix
* fix calls to batch api
* allow more kernel instances
* temporary timing function
* temporary timing function
* tuning data based on hiprtc
* rollback tuning parameters
* fixes
* fixes
* fix inconsistency in the parameters passed to nvrtc/hiprtc
* minor
* a fix to the nb selector
* cleanup
* merge the opt-in feature in CeedRunKernelDimSharedOptinCuda into CeedRunKernelDimSharedCuda
* fix paths for hip-magma backends
* style
* fixes
* running make format
* undo changes from the last commit
* change HIP_DIR to ROCM_DIR and adjust the paths for magma accordingly
* replace HIP_DIR with ROCM_DIR
|
| 8ec64e9a | 24-Dec-2022 |
Jed Brown <jed@jedbrown.org> |
libCEED 0.11.0 |
| c6ebc35d | 30-Nov-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
op - name consistency for composite operator fns |
| 75f0d5a4 | 30-Nov-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
op - add CeedCompositeOperatorGetMultiplicity |
| ea61e9ac | 30-Nov-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - assorted formatting fixes |
| 47fa654b | 18-Nov-2022 |
Jed Brown <jed@jedbrown.org> |
style: remove obsolete (with clang-format) INDENT comments |
| 2b730f8b | 17-Nov-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Switch to clang-format (#1051)
* style - switch to clang-format
* ci - use newer libxsmm
* action - update format action
* format - consistent use of {} for multi-line if/for
* make - remove stray newline
* make - simpler 'make format' target
* ci - use newer libxsmm
* doc - minor release note clarification
* minor - minor fix
* minor - minor fix
* minor - minor fix
* minor - minor fix
* make format
* format - less aggressive alignment rules
* tidy - check for argument name mismatches
* fix newline
* format - mirror Ratel update to .clang-format
* fix merge error
* fix merge conflict
* fix merge error
* drop style in .phony list
* Update .clang-format
Co-authored-by: Jed Brown <jed@jedbrown.org>
* apply updated format
Co-authored-by: Jed Brown <jed@jedbrown.org>
|
| 9e201c85 | 23-Sep-2022 |
Yohann <dudouit1@llnl.gov> |
Refactor `cuda-gen` and `hip-gen` backends. (#1050)
* Add TODO items.
* rough, but something like this?
* wip - cleaning up some warnings, but more remain
* wip - reorganize
* wip - missing kernels
* wip - replace t1d
* fix some kernels
* another typo
* more
* another one
* closer
* define T_1D
* typosgit add .!
* WIP: changes to cuda-shared framework for new kernels
* fix output writing
* buffer fix
* buffer sizes
* WIP: fixes for 2 and 3D basis kernels
* minor
* fix weight kernel for 3d
* remove debugging output
* minor reorg
* fix includes
* enable collo grad for cuda-shared
* move quoted kernels
* renaming
* missed a rename
* small fix
* more naming consistency
* faster 'useCollograd=false' path in *-gen
* more style
* one last style fix
* clearer collograd condition
* Add gen basis kernels to hip-shared
* Try some changes to hip-shared basis block sizes for new kernels
* cuda - drop extra kernel arg
* cuda - fix collograd check logic
* update gen comment about parallelization
* tidy up fields struct definition
* tidy up structs even more
* Update hip-gen basis templates use and move other hip-gen device functions to jit-source
* Finish hip-gen basis template update; small style updates to match CUDA
* missing isStrided
* Update block size used in 3D weight for new shared kernels
* update release notes
Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
Co-authored-by: nbeams <246972+nbeams@users.noreply.github.com>
|
| 228d9efb | 24-Aug-2022 |
James Wright <james@jameswright.xyz> |
ceed: Add CEED_QFUNCTION_ATTR for inlining
GCC doesn't like to inline all Qfunction helpers, so this forces it to do so if inlining is allowed at all. |
| c9c2c079 | 05-Aug-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
QF headers for typedefs and macros (#1036)
* jit - qf headers for typedefs and macros
* jit - smaller list of permitted files
* ceed - only include ceed.h in QF source |
| a76a04e7 | 07-Jul-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
basis - make CreateProjectionMatrix internal fn |
| 151157ab | 07-Jul-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
doc - fix CEED_BASIS_COLLOCATED documentation |
| 990fdeb6 | 21-Jun-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
fmt - add CeedInt_FMT |
| 121d4b7f | 27-Jun-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gallery - fix bad indexing in 3d poisson det