| #
f99981a3
|
| 25-Feb-2022 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #893 from CEED/natalie/more-hip-launch-bounds
HIP: add atomics flag and more kernel launch bounds for performance improvements
|
| #
b3c5430c
|
| 01-Feb-2022 |
nbeams <246972+nbeams@users.noreply.github.com> |
Add flag to use atomic adds on supported AMD GPU hardware
|
| #
9faa5937
|
| 19-Jan-2022 |
Natalie Beams <246972+nbeams@users.noreply.github.com> |
Slight modifications for hiprtc usage in ROCm 4.5 (#850)
|
| #
d92fedf5
|
| 22-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #863 from CEED/jeremy/gpu-jit-code
GPU - separate common code into separate folder
|
| #
0d0321e0
|
| 22-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
style - consistent nameing and style for gpu backends
|
| #
7fcac036
|
| 22-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - split common cuda/hip data into separate folder
|
| #
80a9ef05
|
| 02-Sep-2021 |
Natalie Beams <246972+nbeams@users.noreply.github.com> |
Allow CeedScalar to be single precision (#788)
One can modify `ceed.h` to include `ceed-f32.h` and then use single precision. This is tested for C in CI and has been tested by developers with Rust, Julia, and Python. This interface is evolving and should be considered experimental at this time (thus lack of automated build support).
* Introduce CeedScalarType enum
* WIP changes to allow different definitions of CeedScalar
* Introduce new header files for float and double
* Only use avx tensor contract and MAGMA non-tensor basis if CeedScalar is double
* WIP changes to allow CeedScalar to be float
* WIP start trying to adjust test tolerances for float or double
* fix typos in comments
* install ceed-f32/64 headers
* Fix missing casts for hipMAGMA element restrictions
* make CeedQFunctionContextGetContextSize available for Python bindings
* Changes to Python bindings to allow CeedScalar to be float
* WIP adjust Python tests for float or double
* make style
* remove QFunctionContextGetContextSize from backend header
* Use quotes instead of <> in include statement
* Remove unncessary includes
* Update tolerances for tests
* [Julia] allow CeedScalar to be Float32
* [Julia] Use Preferences instead of custom build configuration
# Conflicts:
# julia/LibCEED.jl/src/C.jl
* [Makefile] Change definition of CC_VENDOR so it works with cross-compilation
* [Julia] Use Preferences in CI
# Conflicts:
# .github/workflows/julia-test-with-style.yml
* [Julia] Update docs about preferences
* [Julia] Add test/Project.toml workaround for Preferences
* Add CeedGetScalarType to get the type of CeedScalar at runtime
* [Julia] Move functions from Ceed.jl to LibCEED.jl
* [Julia] Add support for getting library path and scalar type at runtime
* [Julia] Minor change to checking if CUDA is loaded
* [Julia] Check correct CeedScalar types in basis functions
* [Julia] Fix tests comparing with output file
* [Julia] Change devtests to use CeedScalar instead of Float64
* Update test 402 so context will be same size in double or float
* Update tolerances for ceed examples
* [Julia] CUDA fixes
* remove unused variable in t208
* SchurDecomposition: do not compute tau on final iteration
* Update tolerances for some basis tests (for single precision)
* Make style
* Python style fixes for basis test
* Add single precision output for t300 and t320 and adjust checks; skip t541 in single
* Add LCOV exclusions after moving to new line
* fix spacing
* Python: make CEED_EPSILON available as libceed.EPSILON
* Python: optional parameter to specify different output file for test comparison
* Python: update tests' use of EPSILON and change test_300 output file for single precision
* Python: add convenience function for getting dtype corresponding to CeedScalar
* rust - add single precision support
* [Julia] Fall back on Float64 if CeedGetScalarType is not available
* [Julia] style
* Adjust tolerance for t301
* xsmm - add single precision support
* avx - add single precision support
* Add initial single precision support for MAGMA non-tensor basis
* Skip t300 and t320 in single precision; revert Python t300 changes
* Revert output changes for t300 and t320 in junit
* [Julia] Changes to autogenerated bindings for mixed precision
* [Julia] style
* [Julia] Check scalar type when changing libceed library path
The check is also performed when the package is loaded. This prevents having to
restart the Julia session twice
* [Julia] Require JLLWrappers version 1.3
This is needed to use Preferences to change the library path
* Add documentation page for precision development
Co-authored-by: Will Pazner <will.e.p@gmail.com>
* Cleanup from merge: remove old README
* Return CEED_ALIGN to backend.h
* Make Fortran compiler (FC) optional; empty skips Fortran tests
Use in Python and Rust builds, which may not have a Fortran compiler
installed and thus would produce confusing output.
* Add single precision CI test for Noether
Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
Co-authored-by: Will Pazner <will.e.p@gmail.com>
Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
Co-authored-by: Jed Brown <jed@jedbrown.org>
|
| #
972b3d9d
|
| 27-May-2021 |
Natalie Beams <246972+nbeams@users.noreply.github.com> |
Minor fixes in backends/hip and backends/magma (#771)
* fix typo in ceed-magma header def
* Change setting of gcnArchName option to avoid string overflow
|
| #
91517767
|
| 21-May-2021 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #767 from CEED/update-hiprtc-arch
Update GPU arch argument for hiprtc
|
| #
0ea97b06
|
| 20-May-2021 |
nbeams <246972+nbeams@users.noreply.github.com> |
Update GPU arch argument for hiprtc
|
| #
874019bc
|
| 31-Mar-2021 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #716 from CEED/jed/install-backend.h
Jed/install backend.h
|
| #
ec3da8bc
|
| 26-Mar-2021 |
Jed Brown <jed@jedbrown.org> |
Install install backend headers under include/ceed/
This makes it possible to distribute source plugins that provide additional backends. It's also used in MFEM, perhaps temporarily.
Deprecate ceed-backend.h, which was not previously installed, but some users accessed it from an in-place build.
Also install CUDA and HIP headers that allow users to provide CUfunction and hipFunction_t.
Co-authored-by: Jeremy L. Thompson <jeremy.thompson@colorado.edu>
Requested-by: Andrew T. Barker <barker29@llnl.gov>
|
| #
e15f9bd0
|
| 20-Mar-2021 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Error Handling Improvement [fix #696] (#691)
* Operator - add operator/qfunction field compatibility checks
* QFunction - do not allow adding fields to QFunction in use with an operator
* Examples - add some extra exclusion markers in ceed example
* error - add error enum
* error - update error enum names and numbering
* error - use CEED_ERROR_BACKEND in all backend errors
* error - begin classifying interface errors
* error - update backends to use CEED_ERROR_SUCCESS and CeedChkBackend
* error - use new errors in gallery
* error - add some unsaved modifications
* error - improve documentation
* error - define CEED_ERROR_SUCCESS in GPU JiT; we really should have a common header to pipe defines to the JiT code
* error - more error code editing
* error - fix error string
* operator - fix setting field qpts
* basis - add input/output dimension error checking
* python - move basis utility methods to ceed object, no basis required or used
* python - force exit with negative error code
* make style-py
* rust - initial work to add error handling logic
* rust - add ceed.resource method
* rust - add results for methods that may fail
* rust - also format doctests
* minor - drop unused CeedChk()
* error - rename terminal/nonterminal to major/minor
* rust - set ErrorStore as default errorhandler
* python - revert error handing change for python
* python - use success error code from C bindings
* error - only upgrade error code in backend if positive
|
| #
3d576824
|
| 29-Jan-2021 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
headers - clearify includes to not rely on transitive includes (#701)
* headers - clearify includes to not rely on transitive includes
* style - add header recommendations from 'include-what-you-use'
* style - apply 'include-what-you-use' changes to CUDA backends
* style - 'include-what-you-use' for hip backends
* style - drop ceed.h includes in gallery qf source
* docs - add dev notes for header files
* style - header style and alphabetize
|
| #
fe5822c7
|
| 29-Jul-2020 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #598 from CEED/jeremy/cuda-small-fixes
Small Cuda/Hip Fixes
|
| #
29b67289
|
| 29-Jul-2020 |
jeremylt <thompson.jeremy.luke@gmail.com> |
Hip - fix warning about snprintf
|
| #
a85a7fae
|
| 24-Jul-2020 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #591 from CEED/natalie/hip-ref-v2
Add hip-ref backend
|
| #
17fed040
|
| 15-Jul-2020 |
nbeams <246972+nbeams@users.noreply.github.com> |
update with changes from cuda-ref
|
| #
0df135b4
|
| 22-Jun-2020 |
nbeams <246972+nbeams@users.noreply.github.com> |
comment out the hiprtc-related calls causing memory errors
|
| #
81a63d6f
|
| 17-Jun-2020 |
nbeams <246972+nbeams@users.noreply.github.com> |
fix size of options list for hiprtcCompile
|
| #
30f4f45f
|
| 16-Jun-2020 |
nbeams <246972+nbeams@users.noreply.github.com> |
Experimental HIP port of cuda-ref, round 2, with ROCm3.5
|