| 49aac155 | 24-Mar-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
IWYU fixes (#1182)
* iwyu - include fixes
* iwyu - silence some iwyu output
* minor - clearer macro names
* iwyu - fix suggestion of "ceed/ceed.h" externally
* iwyu - lighter petsc headers
* iwyu - ceed/ceed.h -> ceed.h
* iwyu - cuda/hip include fixes
|
| 2b730f8b | 17-Nov-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Switch to clang-format (#1051)
* style - switch to clang-format
* ci - use newer libxsmm
* action - update format action
* format - consistent use of {} for multi-line if/for
* make - remove stray newline
* make - simpler 'make format' target
* ci - use newer libxsmm
* doc - minor release note clarification
* minor - minor fix
* minor - minor fix
* minor - minor fix
* minor - minor fix
* make format
* format - less aggressive alignment rules
* tidy - check for argument name mismatches
* fix newline
* format - mirror Ratel update to .clang-format
* fix merge error
* fix merge conflict
* fix merge error
* drop style in .phony list
* Update .clang-format
Co-authored-by: Jed Brown <jed@jedbrown.org>
* apply updated format
Co-authored-by: Jed Brown <jed@jedbrown.org>
|
| 9e201c85 | 23-Sep-2022 |
Yohann <dudouit1@llnl.gov> |
Refactor `cuda-gen` and `hip-gen` backends. (#1050)
* Add TODO items.
* rough, but something like this?
* wip - cleaning up some warnings, but more remain
* wip - reorganize
* wip - missing kernels
* wip - replace t1d
* fix some kernels
* another typo
* more
* another one
* closer
* define T_1D
* typosgit add .!
* WIP: changes to cuda-shared framework for new kernels
* fix output writing
* buffer fix
* buffer sizes
* WIP: fixes for 2 and 3D basis kernels
* minor
* fix weight kernel for 3d
* remove debugging output
* minor reorg
* fix includes
* enable collo grad for cuda-shared
* move quoted kernels
* renaming
* missed a rename
* small fix
* more naming consistency
* faster 'useCollograd=false' path in *-gen
* more style
* one last style fix
* clearer collograd condition
* Add gen basis kernels to hip-shared
* Try some changes to hip-shared basis block sizes for new kernels
* cuda - drop extra kernel arg
* cuda - fix collograd check logic
* update gen comment about parallelization
* tidy up fields struct definition
* tidy up structs even more
* Update hip-gen basis templates use and move other hip-gen device functions to jit-source
* Finish hip-gen basis template update; small style updates to match CUDA
* missing isStrided
* Update block size used in 3D weight for new shared kernels
* update release notes
Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
Co-authored-by: nbeams <246972+nbeams@users.noreply.github.com>
|
| 6a5027c1 | 21-Jun-2022 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #1003 from CEED/release
Update main with bugfix |
| b11824b3 | 21-Jun-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - fix setting device id |
| ee5a26f2 | 04-Apr-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
jit - add interface for adding additional jit source dirs |
| a0154ade | 04-Apr-2022 |
Jed Brown <jed@jedbrown.org> |
move include/ceed-jit-source to include/ceed/jit-source |
| 6eb0d8b4 | 01-Apr-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
jit - use relpath from include/ceed-jit-source for jit source files |
| 3d8e8822 | 17-Mar-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - update copyright headers |
| 1f9221fe | 11-Mar-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
vec - use CeedSize for vector lengths |
| e6f67ff7 | 18-Feb-2022 |
Jed Brown <jed@jedbrown.org> |
backends cuda-shared: fix launch bounds to avoid invalid z dimension
The typical max z dimension size of a thread block is 64 and we were computing larger values (like 85) in some cases. |
| c47bfe2b | 16-Feb-2022 |
Jed Brown <jed@jedbrown.org> |
backends/cuda-shared: limit 1D thread counts
We need to avoid this error:
CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES: max_threads_per_block 512 on block size (24,1,32), shared_size 0, num_regs 106
A proper solution is to use cuOccupancyMaxPotentialBlockSize to place a number of elements per block that stays within resource limits. This would involve a bit more refactoring to do cleanly.
|
| d7d111ec | 23-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - style consistency |
| 46dc0734 | 23-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - improved human-readability of debugging output |
| 437930d1 | 22-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - pull quoted kernels into separate files |
| f87d896c | 22-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - drop unused argument in init |
| 7fcac036 | 22-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - split common cuda/hip data into separate folder |
| 6d69246a | 21-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
cuda - separate compile functionality into new header |
| 9c774edd | 17-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
vec/qf - initial valid/borrowed/owned split for data (#853)
* vec/qf - initial valid/borrowed/owned split for data
* vec/qf - tidy logic for checking active/stale data
* minor - add missing NULL
* doc - explain VectorTakeArray update
* minor - update error messages
* test - update error message in junit/tap
* gpu - fix stray CeedScalar vs void for QFunctionContext
* vec/qf - clarify/simplify access logic
* vec - calloc host arrays when no value set to make empty
* style - minor
* style - minor
* minor - fix error messages
* vec/qf - move data validity checking to backend interface
* gpu - add missing sync error checking for qfcontext
* gpu - homogenize use of impl for backend data to reduce confusion
* vec - clarify access conditions
* python - update test for stricter vector access
* vec - minor fixes
* minor - fix ipython change
* vec - add missing declarations in ceed/backend.h
* ctx - mirror vector borrowed data check in ctx interface
* vec - add CeedVectorGetArrayWrite
* vec - consistent use of CeedVectorGetArray vs CeedVectorGetArrayWrite
* python - small vec fixes
* doc - describe vector data semantics
* magma - update restriction
* gpu - fix restr bug I added, need to sum into target
* magma - fix restriction bug
* cpu - fix restriction bug here too
* op - fix evec allocations
* julia - fix ElemRestriction for new vector access rules
* op - double check GetArray vs Read vs Write usage
* doc - small fix
* restr - clean up read/write logic for restr
* python - add vec.array_write
* magma - typo fix
|
| 80a9ef05 | 02-Sep-2021 |
Natalie Beams <246972+nbeams@users.noreply.github.com> |
Allow CeedScalar to be single precision (#788)
One can modify `ceed.h` to include `ceed-f32.h` and then use single precision. This is tested for C in CI and has been tested by developers with Rust, Julia, and Python. This interface is evolving and should be considered experimental at this time (thus lack of automated build support).
* Introduce CeedScalarType enum
* WIP changes to allow different definitions of CeedScalar
* Introduce new header files for float and double
* Only use avx tensor contract and MAGMA non-tensor basis if CeedScalar is double
* WIP changes to allow CeedScalar to be float
* WIP start trying to adjust test tolerances for float or double
* fix typos in comments
* install ceed-f32/64 headers
* Fix missing casts for hipMAGMA element restrictions
* make CeedQFunctionContextGetContextSize available for Python bindings
* Changes to Python bindings to allow CeedScalar to be float
* WIP adjust Python tests for float or double
* make style
* remove QFunctionContextGetContextSize from backend header
* Use quotes instead of <> in include statement
* Remove unnecessary includes
* Update tolerances for tests
* [Julia] allow CeedScalar to be Float32
* [Julia] Use Preferences instead of custom build configuration
# Conflicts:
# julia/LibCEED.jl/src/C.jl
* [Makefile] Change definition of CC_VENDOR so it works with cross-compilation
* [Julia] Use Preferences in CI
# Conflicts:
# .github/workflows/julia-test-with-style.yml
* [Julia] Update docs about preferences
* [Julia] Add test/Project.toml workaround for Preferences
* Add CeedGetScalarType to get the type of CeedScalar at runtime
* [Julia] Move functions from Ceed.jl to LibCEED.jl
* [Julia] Add support for getting library path and scalar type at runtime
* [Julia] Minor change to checking if CUDA is loaded
* [Julia] Check correct CeedScalar types in basis functions
* [Julia] Fix tests comparing with output file
* [Julia] Change devtests to use CeedScalar instead of Float64
* Update test 402 so context will be same size in double or float
* Update tolerances for ceed examples
* [Julia] CUDA fixes
* remove unused variable in t208
* SchurDecomposition: do not compute tau on final iteration
* Update tolerances for some basis tests (for single precision)
* Make style
* Python style fixes for basis test
* Add single precision output for t300 and t320 and adjust checks; skip t541 in single
* Add LCOV exclusions after moving to new line
* fix spacing
* Python: make CEED_EPSILON available as libceed.EPSILON
* Python: optional parameter to specify different output file for test comparison
* Python: update tests' use of EPSILON and change test_300 output file for single precision
* Python: add convenience function for getting dtype corresponding to CeedScalar
* rust - add single precision support
* [Julia] Fall back on Float64 if CeedGetScalarType is not available
* [Julia] style
* Adjust tolerance for t301
* xsmm - add single precision support
* avx - add single precision support
* Add initial single precision support for MAGMA non-tensor basis
* Skip t300 and t320 in single precision; revert Python t300 changes
* Revert output changes for t300 and t320 in junit
* [Julia] Changes to autogenerated bindings for mixed precision
* [Julia] style
* [Julia] Check scalar type when changing libceed library path
The check is also performed when the package is loaded. This prevents having to
restart the Julia session twice
* [Julia] Require JLLWrappers version 1.3
This is needed to use Preferences to change the library path
* Add documentation page for precision development
Co-authored-by: Will Pazner <will.e.p@gmail.com>
* Cleanup from merge: remove old README
* Return CEED_ALIGN to backend.h
* Make Fortran compiler (FC) optional; empty skips Fortran tests
Use in Python and Rust builds, which may not have a Fortran compiler
installed and thus would produce confusing output.
* Add single precision CI test for Noether
Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
Co-authored-by: Will Pazner <will.e.p@gmail.com>
Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
Co-authored-by: Jed Brown <jed@jedbrown.org>
|
| 6dbfb411 | 05-Apr-2021 |
nbeams <246972+nbeams@users.noreply.github.com> |
Update device ID selection for HIP/CUDA backends; add for MAGMA backends |
| ec3da8bc | 26-Mar-2021 |
Jed Brown <jed@jedbrown.org> |
Install backend headers under include/ceed/
This makes it possible to distribute source plugins that provide additional backends. It's also used in MFEM, perhaps temporarily.
Deprecate ceed-backend.h, which was not previously installed, but some users accessed it from an in-place build.
Also install CUDA and HIP headers that allow users to provide CUfunction and hipFunction_t.
Co-authored-by: Jeremy L. Thompson <jeremy.thompson@colorado.edu> Requested-by: Andrew T. Barker <barker29@llnl.gov>
|
| e15f9bd0 | 20-Mar-2021 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Error Handling Improvement [fix #696] (#691)
* Operator - add operator/qfunction field compatibility checks
* QFunction - do not allow adding fields to QFunction in use with an operator
* Examples - add some extra exclusion markers in ceed example
* error - add error enum
* error - update error enum names and numbering
* error - use CEED_ERROR_BACKEND in all backend errors
* error - begin classifying interface errors
* error - update backends to use CEED_ERROR_SUCCESS and CeedChkBackend
* error - use new errors in gallery
* error - add some unsaved modifications
* error - improve documentation
* error - define CEED_ERROR_SUCCESS in GPU JiT; we really should have a common header to pipe defines to the JiT code
* error - more error code editing
* error - fix error string
* operator - fix setting field qpts
* basis - add input/output dimension error checking
* python - move basis utility methods to ceed object, no basis required or used
* python - force exit with negative error code
* make style-py
* rust - initial work to add error handling logic
* rust - add ceed.resource method
* rust - add results for methods that may fail
* rust - also format doctests
* minor - drop unused CeedChk()
* error - rename terminal/nonterminal to major/minor
* rust - set ErrorStore as default errorhandler
* python - revert error handling change for python
* python - use success error code from C bindings
* error - only upgrade error code in backend if positive
|
| 3d576824 | 29-Jan-2021 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
headers - clarify includes to not rely on transitive includes (#701)
* headers - clarify includes to not rely on transitive includes
* style - add header recommendations from 'include-what-you-use'
* style - apply 'include-what-you-use' changes to CUDA backends
* style - 'include-what-you-use' for hip backends
* style - drop ceed.h includes in gallery qf source
* docs - add dev notes for header files
* style - header style and alphabetize
|
| 1d013790 | 14-Dec-2020 |
Jed Brown <jed@jedbrown.org> |
Add static library (libceed.a) [resolve #670]
We no longer use __attribute__((constructor)) to register backends and gallery implementations because we can't ensure that the symbols are linked into applications that link the static library. We've switched to creating CeedRegisterAll() and CeedQFunctionRegisterAll(), which are called automatically by the library, and call weak symbols to register all the backend/gallery implementations. This strategy was partly motivated by not wanting to have preprocessor macros describing what is available, and the associated need to recompile rather than just relink when those macros change.
So we now have backends/ceed-backend-list.h that declares all the backends wrapped in a macro. It is included by backends/ceed-backend-weak.c to create weak definitions of all the backends. In the makefile, we sort so this comes last when linking a shared or static library, and thus these weak symbols will only be picked up if they were not defined by the actual backend source files. The same header is included (with different macro wrapping) in interface/ceed-register.c, where CeedRegisterAll() is defined.
To add a new backend, one must do essentially the same registration strategy as in the past, plus add one line to the common ceed-backend-list.h.
|