| 0a7bea2d | 28-Aug-2020 |
Will Pazner <will.e.p@gmail.com> |
Add SetCUDAUserFunction to /gpu/cuda/ref backend |
| 2a8ae808 | 17-Sep-2020 |
nbeams <246972+nbeams@users.noreply.github.com> |
Merge branch 'main' into icl/hip-magma |
| 461525f5 | 17-Sep-2020 |
Natalie Beams <246972+nbeams@users.noreply.github.com> |
Consolidate CUDA backends (#623)
* Travis - allow icc failure for now
* move cuda-reg basis init kernels to cuda-shared
* move cuda-reg restrictions to cuda-ref
* change delegate ceeds for previous uses of cuda-reg
* remove cuda-reg backend
* update hip restrictions to match cuda
* update backends list in README
* make style
* update release notes for removal of cuda-reg
Co-authored-by: jeremylt <thompson.jeremy.luke@gmail.com>
|
| 18d499f1 | 17-Sep-2020 |
Yohann <dudouit1@llnl.gov> |
Enable under-integration for cuda-shared and cuda-gen backends (#620)
* Support under integration in cuda-shared.
* Add under-integration to the cuda-gen backend.
* Fix bugs when under-integ in cuda-shared.
* Factor some code.
* Factor some code in cuda-gen.
* Guard more carefully.
* Introduce T1d.
* Fix a bug in readQuads3d
* Fix bugs in 3D.
* Fix a typo
* Safety init.
* Try something with ContractZ3d.
* Guard the add
* revert add.
* Add more thread guards
* Same as previous
* Fix a bug in add.
* style.
* Check that the bases are tensor in cuda-gen.
* move isTensor
* Add T1d to cuda-gen and guard contractions.
* Fix typos.
* add guards in 1d.
* Rewrite weight functions.
* typo
* CUDA - fix cuda-gen collocated check
* make style.
Co-authored-by: jeremylt <thompson.jeremy.luke@gmail.com>
|
| 59f7e599 | 08-Sep-2020 |
nbeams <246972+nbeams@users.noreply.github.com> |
move device kernels to common folder |
| 969f2b10 | 02-Sep-2020 |
nbeams <246972+nbeams@users.noreply.github.com> |
Add HIP support for MAGMA backend |
| 64d3f0c0 | 26-Aug-2020 |
jeremylt <thompson.jeremy.luke@gmail.com> |
Cuda - improve variable name clarity |
| 75c7b208 | 25-Aug-2020 |
jeremylt <thompson.jeremy.luke@gmail.com> |
CUDA - drop double negation |
| 0f54b25e | 25-Aug-2020 |
jeremylt <thompson.jeremy.luke@gmail.com> |
CUDA - clean up logic for collograd in cuda/gen, should be based on all bases with interp/grad |
| 6c845298 | 25-Aug-2020 |
jeremylt <thompson.jeremy.luke@gmail.com> |
CUDA - drop extra casts |
| c8ed46e2 | 25-Aug-2020 |
Yohann <dudouit1@llnl.gov> |
Merge branch 'main' into yohann/fix-cuda-gen |
| 792ff326 | 25-Aug-2020 |
Yohann Dudouit <dudouit1@llnl.gov> |
Access the restriction before using it. |
| 9647a07e | 22-Aug-2020 |
David Medina <dmed256@gmail.com> |
OCCA Backend Update (#305)
* [Docs] Update docs for new OCCA backend
* [Fortran] CeedVectorRestoreArray fix
* [Test] Updates t002-ceed test to support query params
* [Make] Adds tidy for cpp files
* [OCCA] Updates OCCA backend
* PR feedback: Update StrideType naming
* PR feedback: Fixed USER_STRIDES vs BACKEND_STRIDES usage
* [OCCA] Add comments to code generation
* [CI] Removes commit from OCCA build
* PR Feedback: Update README
* PR Feedback: Removed compiler warnings
* Fix restriction function changes
* occa: track AssembleLinear -> LinearAssemble
* [OCCA][Vector] Adds takeArray method
* [OCCA][Restriction] Fixes restriction strides
* [OCCA][Operator] Adds point block diagonal registration
* [OCCA][Operator] Fixes
* OCCA - update debug color for visibility
* Travis - fix extra fi
* OCCA - adjust kernel, multi component derivatives are expected as [dim][comp][q]
* OCCA - adjust basis kernel args in operator kernel to agree with expected [dim][comp][q] ordering
* OCCA - fix uninitialized memory in grad transpose 3d kernel
* OCCA - fix Elayout description
* OCCA - fix bad dimensions in basis kernel
* OCCA - fix TakeArray to sync before returning array pointer
* MFEM - print error when test fails
* OCCA - fix 2d grad kernels
* OCCA - flag diagonal of composite operators unsupported
* OCCA - fix restoreArray logic
* OCCA - minor cleanup with GPU
* Travis - add 'make info' where able for debugging
* OCCA - explicitly test OpenCL mode
* OCCA - drop restrict for ElemRestriction kernels, OpenCL doesn't like it
* OCCA - explicitly test cuda and hip versions of OCCA backend
* OCCA - explicitly test OpenMP mode in OCCA
* Tests - modify check for VLA support for OCCA to catch all OCCA modes
* WIP - test possible OCCA fix for PowerPC
* OCCA - separate CPU modes for testing as well
* Readme - update list of OCCA backend modes
* Makefile - fix unterminated addprefix
* OCCA - enable direct access to OCCA Serial mode
* OCCA - add comments to registration
* Makefile - remove extra )
* OCCA - remove pass by reference C++ syntax for OpenCL compatibility
* OCCA - drop use of @restrict for OpenCL
* OCCA - remove OpenCL mode, not fully supported in OCCA (see OCCA issue #166)
* OCCA - fixing rebase issues
* OCCA - Fix implementation of QFunctionContext
* OCCA - move GetContextSize so ierr check actually works
* Travis - use libOCCA instead of jeremylt/occa
* Junit - update OCCA test skip list
* Make - simplify OCCA check for enabled modes
Co-authored-by: Jed Brown <jed@jedbrown.org>
Co-authored-by: Jeremy L. Thompson <jeremy.thompson@colorado.edu>
|
| 3069e47f | 20-Aug-2020 |
jeremylt <thompson.jeremy.luke@gmail.com> |
Hip - shorten up resource strncmp to remove requirement for trailing slash with /gpu/hip |
| 777ff853 | 14-Aug-2020 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
QFunction Context Data Object (#596)
* Ctx - create context object for QFunction context data
* Context - rename UserContext -> QFunctionContext
* Ctx - add lcov markers
* Ctx - fix leak in identity QFunctions
* Hip/Cuda - rename sync functions for vector/context
* Tests - lcov marker update
* QFunction - drop unused function
* Python - fix copy-paste errors
* Ctx - update notes for Fortran usage
* Fortran - drop unneeded cast
Co-authored-by: Jed Brown <jed@jedbrown.org>
* Interface - use void* for SetData interfaces
* Make - use call quiet for NVCC
* Interface - use void* for GetData interfaces
* Make - add quiet call option for examples
* Makefile - create common makefile to reduce duplication/complexity in example makefiles
Co-authored-by: Jed Brown <jed@jedbrown.org>
|
| e299b378 | 10-Aug-2020 |
jeremylt <thompson.jeremy.luke@gmail.com> |
Hip - add missing ierr |
| 0f09838f | 29-Jul-2020 |
jeremylt <thompson.jeremy.luke@gmail.com> |
Cuda/Hip - name QFunctions for easier profiling |
| 29b67289 | 29-Jul-2020 |
jeremylt <thompson.jeremy.luke@gmail.com> |
Hip - fix warning about snprintf |
| 752c3701 | 28-Jul-2020 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Fix CodeCov Reports (#597)
* Tests - use qfunction headers for Fortran tests to improve bypass gcov issue
* Gitlab - use latest gcc on Noether
* Cuda/Hip - add case in reciprocal for completeness
* Cuda - remove duplicate case
* Makefile - exclude fortran test headers from make style
* Travis - update to Focal
* Cov - adjust style to be consistent and avoid false misses
* Travis - update comments and style
|
| d99fa3c5 | 28-Jul-2020 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Op - add interface for multigrid level creation (#579)
* Op - add interface for multigrid level creation
* Op - add implementation for OperatorMultigridLevelCreate
* make style
* make tidy
* Op - add test t550, fix errors
* Tests - add Fortran version of t550
* Tests - add t511 for testing tensor basis multigrid level setup
* make style and tidy
* Tests - fix t55* memory leaks
* Tests - add t552 for non-tensor basis multigrid levels
* CUDA - use CeedIntMax in shared CUDA backend
* Tests - add OCCA test exception for t55*
* Op - add lvector global prolongation multiplicity, simplifies user interface
* Solids - convert example to new interface
* make style
* Tests - convert t550 to multicomponent
* Solids - drop unused ceed_fine
* Python - add new multigrid level interface
* Python - fix operator wrap, use ceed python obj rather than ceed pointer
* Gallery - update comment slightly
* Tests - remove accidental duplicate test
* Multigrid - add size=2 case as well
* Operator - drop unneeded inline
* QFunction - simplify context ownership to match vector
* make style
* Python - update multigrid function signature
* Operator - refactor prolong/restrict qfunctions as scaling qfunctions
* Vector - add testing for reciprocal and add to Fortran/Python interfaces
* CUDA - add VectorReciprocal on device
* Gallery - drop specialized versions for 'Scale', will fix performance hit later
* Hip - add vector reciprocal
* Operator - add more flexible prolongation basis creation interface
* Vec - make sure data is set for VectorReciprocal
* Tests - drop ncomp for t550/1 so kernel is not too large for Magma backend
* Tests - add missing lcov markers
* make style
* Travis - allow ARM job to fail
* Travis - fix intel install
* Travis - try different install dir name for inteloneapi
* Travis - add ifort, ipp packages
* Tests - add missing lcov marker
|
| a1766732 | 27-Jul-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
Cuda - add cublasGetErrorName exclusion to match hip |
| 6bbcfef4 | 27-Jul-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
Hip/Cuda - expand QFunction LCOV exception for failing to open qf source file |
| b2573fe1 | 27-Jul-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
Device - put device kernels in separate 'kernels' folder in backends |
| e9f4dca0 | 27-Jul-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
Cuda - add LCOV markers |
| 34f6cd3f | 27-Jul-2020 |
Jeremy L Thompson <thompson.jeremy.luke@gmail.com> |
Hip - add LCOV markers |