| #
a20f00ad
|
| 03-Nov-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1085 from CEED/jeremy/kernel-name
|
| #
204bfdd7
|
| 03-Nov-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - clearer GPU QF/Op kernel names
|
| #
9e201c85
|
| 23-Sep-2022 |
Yohann <dudouit1@llnl.gov> |
Refactor `cuda-gen` and `hip-gen` backends. (#1050)
* Add TODO items.
* rough, but something like this?
* wip - cleaning up some warnings, but more remain
* wip - reorganize
* wip - missing kernels
* wip - replace t1d
* fix some kernels
* another typo
* more
* another one
* closer
* define T_1D
* typosgit add .!
* WIP: changes to cuda-shared framework for new kernels
* fix output writing
* buffer fix
* buffer sizes
* WIP: fixes for 2 and 3D basis kernels
* minor
* fix weight kernel for 3d
* remove debugging output
* minor reorg
* fix includes
* enable collo grad for cuda-shared
* move quoted kernels
* renaming
* missed a rename
* small fix
* more naming consistency
* faster 'useCollograd=false' path in *-gen
* more style
* one last style fix
* clearer collograd condition
* Add gen basis kernels to hip-shared
* Try some changes to hip-shared basis block sizes for new kernels
* cuda - drop extra kernel arg
* cuda - fix collograd check logic
* update gen comment about parallelization
* tidy up fields struct definition
* tidy up structs even more
* Update hip-gen basis templates use and move other hip-gen device functions to jit-source
* Finish hip-gen basis template update; small style updates to match CUDA
* missing isStrided
* Update block size used in 3D weight for new shared kernels
* update release notes
Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
Co-authored-by: nbeams <246972+nbeams@users.noreply.github.com>
|
| #
dc64899e
|
| 06-Sep-2022 |
Yohann <dudouit1@llnl.gov> |
Change the initialization logic for `useCollograd`. (#1021)
* Change the initialization logic for `useCollograd`.
* Guard useCollograd for 3D only.
* Propagate `useCollograd` change to `hip-gen`.
* Update backends/hip-gen/ceed-hip-gen-operator-build.cpp
* Propagate changes to `hig-gen`.
* Revert redimensioning of `r_tt`.
Co-authored-by: Jed Brown <jed@jedbrown.org>
|
| #
c9c2c079
|
| 05-Aug-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
QF headers for typedefs and macros (#1036)
* jit - qf headers for typedefs and macros
* jit - smaller list of permitted files
* ceed - only include ceed.h in QF source
|
| #
e8001fe0
|
| 07-Jul-2022 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #1009 from CEED/jrwrigh/dirichlet_with_libceed
Fluids - Use libCEED to compute Dirichlet boundary conditions
|
| #
3b0d37b7
|
| 07-Jul-2022 |
Jed Brown <jed@jedbrown.org> |
{cuda,hip}/gen: fix incorrect quadrature points when all bases are collocated
https://github.com/CEED/libCEED/pull/1009#issuecomment-1176751436
Co-authored-by: Natalie Beams <nbeams@icl.utk.edu>
|
| #
c9d492da
|
| 23-Jun-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1006 from CEED/jeremy/gen-all-collo-fix
Fix /gpu/*/gen backends for op with all CEED_BASIS_COLLOCATED
|
| #
1d47fde2
|
| 22-Jun-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - fix /gpu/*/gen backends for op with all CEED_BASIS_COLLOCATED
|
| #
ce18bed9
|
| 17-Mar-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #858 from CEED/jeremy/dump-copy-stuff
Strip redundant/outdated license info duplication
|
| #
3d8e8822
|
| 17-Mar-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - update copyright headers
|
| #
60224bc5
|
| 14-Mar-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #913 from CEED/jeremy/coo-ptrdiff
Create CeedSize as ptrdiff_t
|
| #
e79b91d9
|
| 11-Mar-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
rstr - use CeedSize for l_size
|
| #
51d630a3
|
| 24-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #864 from CEED/jeremy/gpu-templates
GPU - pull quoted kernels into separate files
|
| #
46dc0734
|
| 23-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - improved human-readability of debugging output
|
| #
437930d1
|
| 22-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - pull quoted kernels into separate files
|
| #
d92fedf5
|
| 22-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #863 from CEED/jeremy/gpu-jit-code
GPU - separate common code into separate folder
|
| #
0d0321e0
|
| 22-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
style - consistent nameing and style for gpu backends
|
| #
7fcac036
|
| 22-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - split common cuda/hip data into separate folder
|
| #
6d69246a
|
| 21-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
cuda - separate compile functionality into new header
|
| #
d0dee30e
|
| 19-Nov-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #840 from CEED/jeremy/env-debug
Macro for Debug without Ceed Context
|
| #
3f21f6b1
|
| 10-Nov-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
debug - create CeedDebugEnv macro, refactor CeedDebug macro
|
| #
743f4ebb
|
| 28-Sep-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #811 from CEED/jeremy/some-caching
Store Objects For QFunction Reassembly
|
| #
88db6d3b
|
| 15-Sep-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
tidy - minor fixes in CUDA
|
| #
0b548709
|
| 14-Sep-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #806 from CEED/jeremy/get-fields
Promote Field Getters to Public API
|