| 347d7e4c | 23-Jun-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
CUDA - add small leak fix in QFunction JiT load |
| 7df94212 | 23-Jun-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
CUDA - clean up includes |
| 73b3ccaf | 23-Jun-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
CUDA - clean up minor memory leak |
| 599c6f32 | 24-Jun-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
XSMM - switch to LIBXSMM 143409c, reverted redefinition of _Bool |
| 0df135b4 | 22-Jun-2020 |
nbeams <246972+nbeams@users.noreply.github.com> |
comment out the hiprtc-related calls causing memory errors |
| 0131da11 | 19-Jun-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
XSMM - use v1.16 |
| 81a63d6f | 17-Jun-2020 |
nbeams <246972+nbeams@users.noreply.github.com> |
fix size of options list for hiprtcCompile |
| 9e9210b8 | 17-Jun-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
Op - add AssembleAdd version of diagonal assembly functions, will be helpful for MFEM integration |
| 2bba3ffa | 17-Jun-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
Op - change LinearAssemble* to accept CeedVector instead of pointer to CeedVector, allows for less memory movement and interfaces with parent code better |
| 30f4f45f | 16-Jun-2020 |
nbeams <246972+nbeams@users.noreply.github.com> |
Experimental HIP port of cuda-ref, round 2, with ROCm3.5 |
| 82253e48 | 15-Jun-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
make style |
| 80ac2e43 | 15-Jun-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
style - rename AssembleLinear* to LinearAssemble* |
| fd364f38 | 15-Jun-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
style - change Get*Status to Is* |
| c04a41a7 | 15-Jun-2020 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Composite Operator support for AssembleLinearDiag/PBDiag (#552)
* ref - add composite operator support for building diagonal/pb diagonal
* ref - add error for non-composite multi-field operator diagonal/pb diagonal assembly
* tap - add t538 exclusion because OCCA does not support galleries
* tests - adjust test cases for ceed examples for test coverage
* Op - fix documentation
|
| b1d74153 | 12-Jun-2020 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
CUDA/MAGMA - add AssembleLinearQFunction (#553)
* CUDA - refactor operator apply for CUDA assemble linear QF impl
* CUDA/MAGMA - add AssembleLinearQFunction
* make style
* CUDA - clean up assembleLinearQF after q/e layout refactor
* CUDA - fallback operator for cuda/gen to cuda/ref
* CUDA - use delegation for cuda/gen preferred memtype
|
| 49fd234c | 12-Jun-2020 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Convert CUDA ref/reg/shared E-Layout (#554)
* tests - update tests for multiple e-layouts
* CUDA - convert ref and reg restrictions to Q-layout
* CUDA - ref/reg/shared use gen/magma E-Layout for multi element basis apply and operator apply
* CUDA/MAGMA - drop eandqdiffer and separate MAGMA operator code
* CUDA - update operator comment
* reg - clarify read/write dofs/quads
* CUDA - drop dead code
|
| d965c7a7 | 06-Jun-2020 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
CPU Impl of AssemblePointBlockDiagonal (#503)
* Operator - add AssemblePointBlockDiagonal for CPU backends, with test
* CUDA - add point block diagonal not supported message
* make style
* Operator - improve point block description
* OCCA - explicitly remove OCCA fallback to CPU assembly functions, will update after new OCCA backend
* Op - remove gap removal in point block diagonal
* Op - update diagonal assembly documentation
* Update backends/ref/ceed-ref-operator.c
Co-authored-by: Jed Brown <jed@jedbrown.org>
* style - fix extra space in * with nopad
Co-authored-by: Jed Brown <jed@jedbrown.org>
|
| 59ae912b | 21-May-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
OCCA - small fixes highlighted by icc |
| 9a2291e3 | 19-May-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
CUDA - fix empty restriction support, must use GetStridedStatus over buggy indices pointer check |
| e0582403 | 15-May-2020 |
abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com> |
Icl/magma queue (#524)
Update the MAGMA backend:
* add new specialized tensor basis kernels
* add batched DGEMM wrapper for non-tensor basis kernels
* switch backend to use MAGMA v2 interface
Co-authored-by: nbeams <246972+nbeams@users.noreply.github.com>
Co-authored-by: Stan Tomov <tomov@eecs.utk.edu>
|
| 9ff7e165 | 14-May-2020 |
Jed Brown <jed@jedbrown.org> |
cuda-gen: avoid gcc-10 warning on use of strncat |
| 20aaa365 | 08-May-2020 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
XSMM - fix q=1 index computation (#531)
* XSMM - use hash table for kernel index lookup, add khash to libCEED
* Hash - add CeedHashIJKLM to simplify xsmm tensor hash table code
* simplify use of hash table, use kernels as values
* XSMM: more idiomatic use of khash
* make style
Co-authored-by: Jed Brown <jed@jedbrown.org>
|
| a8c028e3 | 07-May-2020 |
Natalie Beams <246972+nbeams@users.noreply.github.com> |
CEED_STRIDES_BACKEND optimization for cuda-ref operator apply (#528)
* add check for backend stride status for input vectors
* add backend strides check for output vectors
* replace output copy with elem restriction for none emode
* move input skip_restrict check to setup and never allocate E-vec if not needed
* add boolean variable for E/Q vector layout for further optimization of output and add wrapper function in magma backend to create a cuda-ref operator and change this state variable
* Add missing CeedChks
* style changes to better match cuda backends
* missed style change for evec check
* add CeedChk from PR #525 (merge conflict)
* make style changes
* adjust size of nqpts for non-tensor basis
|
| 05ddf119 | 05-May-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
MAGMA - support empty restrictions |
| 274b8d22 | 05-May-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
CUDA - support empty restrictions |