| #
7c735608
|
| 19-Mar-2022 |
Jed Brown <jed@jedbrown.org> |
Makefile: support CUDA stubs and LDFLAGS override
Co-authored-by: Stefano Zampini <stefano.zampini@gmail.com>
|
| #
fb73140c
|
| 18-Mar-2022 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #923 from CEED/natalie/simplify-magma-atomics
MAGMA: simplify atomic add usage, improve installation
|
| #
a11a3c55
|
| 17-Mar-2022 |
nbeams <246972+nbeams@users.noreply.github.com> |
MAGMA: simplify atomic add usage and reduce MAGMA header file usage
|
| #
ce18bed9
|
| 17-Mar-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #858 from CEED/jeremy/dump-copy-stuff
Strip redundant/outdated license info duplication
|
| #
3d8e8822
|
| 17-Mar-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - update copyright headers
|
| #
c30a57c5
|
| 16-Mar-2022 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #917 from CEED/jed/makefile-avx-cflags
Visibility and AVX flag detection
|
| #
130eedf9
|
| 16-Mar-2022 |
Jed Brown <jed@jedbrown.org> |
Makefile: AVX detection needs CFLAGS
Package managers (Spack, PETSc --download-libceed) will typically specify CFLAGS directly rather than using the OPT= shortcut. The test needs to use the flags they set, not our default (which includes -march=native).
|
| #
b76375bc
|
| 25-Feb-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #906 from CEED/jeremy/small-make-fix
make - fix env BACKENDS detection
|
| #
5de894e4
|
| 23-Feb-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
make - fix env BACKENDS detection
|
| #
f99981a3
|
| 25-Feb-2022 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #893 from CEED/natalie/more-hip-launch-bounds
HIP: add atomics flag and more kernel launch bounds for performance improvements
|
| #
b3c5430c
|
| 01-Feb-2022 |
nbeams <246972+nbeams@users.noreply.github.com> |
Add flag to use atomic adds on supported AMD GPU hardware
|
| #
9477ba7d
|
| 25-Jan-2022 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #879 from CEED/jed/doxygen-cleanup
Doxygen cleanup
|
| #
63be1c69
|
| 21-Jan-2022 |
Jed Brown <jed@jedbrown.org> |
Makefile: quiet Doxygen output so warnings are more noticeable
|
| #
d92fedf5
|
| 22-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #863 from CEED/jeremy/gpu-jit-code
GPU - separate common code into separate folder
|
| #
7fcac036
|
| 22-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - split common cuda/hip data into separate folder
|
| #
e92bacf0
|
| 19-Nov-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #838 from CEED/jeremy/iwyu
Fix iwyu target
|
| #
9c06f60a
|
| 08-Nov-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
make - fix iwyu target
|
| #
83057686
|
| 08-Nov-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #831 from CEED/jeremy/minor-typo
doc - minor typo
|
| #
db52d626
|
| 08-Nov-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
make - add iwyu makefile target
|
| #
5b1362b9
|
| 29-Oct-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #833 from CEED/jeremy/minor-verbose
makefile - respond to either V or VERBOSE
|
| #
8df0376f
|
| 29-Oct-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
makefile - respond to either V or VERBOSE
|
| #
aae8ce39
|
| 01-Oct-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #820 from CEED/jeremy/warn-occa
Add OCCA backend warning
|
| #
9a1d3511
|
| 30-Sep-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - remove deprecated headers
|
| #
80a9ef05
|
| 02-Sep-2021 |
Natalie Beams <246972+nbeams@users.noreply.github.com> |
Allow CeedScalar to be single precision (#788)
One can modify `ceed.h` to include `ceed-f32.h` and then use single precision. This is tested for C in CI and has been tested by developers with Rust, Julia, and Python. This interface is evolving and should be considered experimental at this time (thus lack of automated build support).
* Introduce CeedScalarType enum
* WIP changes to allow different definitions of CeedScalar
* Introduce new header files for float and double
* Only use avx tensor contract and MAGMA non-tensor basis if CeedScalar is double
* WIP changes to allow CeedScalar to be float
* WIP start trying to adjust test tolerances for float or double
* fix typos in comments
* install ceed-f32/64 headers
* Fix missing casts for hipMAGMA element restrictions
* make CeedQFunctionContextGetContextSize available for Python bindings
* Changes to Python bindings to allow CeedScalar to be float
* WIP adjust Python tests for float or double
* make style
* remove QFunctionContextGetContextSize from backend header
* Use quotes instead of <> in include statement
* Remove unnecessary includes
* Update tolerances for tests
* [Julia] allow CeedScalar to be Float32
* [Julia] Use Preferences instead of custom build configuration
# Conflicts:
# julia/LibCEED.jl/src/C.jl
* [Makefile] Change definition of CC_VENDOR so it works with cross-compilation
* [Julia] Use Preferences in CI
# Conflicts:
# .github/workflows/julia-test-with-style.yml
* [Julia] Update docs about preferences
* [Julia] Add test/Project.toml workaround for Preferences
* Add CeedGetScalarType to get the type of CeedScalar at runtime
* [Julia] Move functions from Ceed.jl to LibCEED.jl
* [Julia] Add support for getting library path and scalar type at runtime
* [Julia] Minor change to checking if CUDA is loaded
* [Julia] Check correct CeedScalar types in basis functions
* [Julia] Fix tests comparing with output file
* [Julia] Change devtests to use CeedScalar instead of Float64
* Update test 402 so context will be same size in double or float
* Update tolerances for ceed examples
* [Julia] CUDA fixes
* remove unused variable in t208
* SchurDecomposition: do not compute tau on final iteration
* Update tolerances for some basis tests (for single precision)
* Make style
* Python style fixes for basis test
* Add single precision output for t300 and t320 and adjust checks; skip t541 in single
* Add LCOV exclusions after moving to new line
* fix spacing
* Python: make CEED_EPSILON available as libceed.EPSILON
* Python: optional parameter to specify different output file for test comparison
* Python: update tests' use of EPSILON and change test_300 output file for single precision
* Python: add convenience function for getting dtype corresponding to CeedScalar
* rust - add single precision support
* [Julia] Fall back on Float64 if CeedGetScalarType is not available
* [Julia] style
* Adjust tolerance for t301
* xsmm - add single precision support
* avx - add single precision support
* Add initial single precision support for MAGMA non-tensor basis
* Skip t300 and t320 in single precision; revert Python t300 changes
* Revert output changes for t300 and t320 in junit
* [Julia] Changes to autogenerated bindings for mixed precision
* [Julia] style
* [Julia] Check scalar type when changing libceed library path
The check is also performed when the package is loaded. This prevents having to
restart the Julia session twice
* [Julia] Require JLLWrappers version 1.3
This is needed to use Preferences to change the library path
* Add documentation page for precision development
Co-authored-by: Will Pazner <will.e.p@gmail.com>
* Cleanup from merge: remove old README
* Return CEED_ALIGN to backend.h
* Make Fortran compiler (FC) optional; empty skips Fortran tests
Use in Python and Rust builds, which may not have a Fortran compiler
installed and thus would produce confusing output.
* Add single precision CI test for Noether
Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
Co-authored-by: Will Pazner <will.e.p@gmail.com>
Co-authored-by: Jed Brown <jed@jedbrown.org>
|
| #
3c17d89b
|
| 29-Aug-2021 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #803 from CEED/jed/test-junit
testing updates: junit classname, bpsraw tolerances, CUDA on lv
|