| #
00125730
|
| 25-Jul-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Add missing checks for support of different element restriction types in backends
|
| #
0305e208
|
| 06-May-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Update backends for unified ElemRestrictionCreate variants for all restriction types (default, oriented, strided)
|
| #
4b35598d
|
| 20-Jun-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1231 from CEED/jeremy/consistency
Consistency fixes
|
| #
eb7e6caf
|
| 16-Jun-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - naming consistency fixes
|
| #
30d6126f
|
| 25-Apr-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1202 from CEED/jeremy/const-fix
Clean up backend headers for const and argument names
|
| #
472941f0
|
| 21-Apr-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - fix static vs CEED_INTERN in backend file
|
| #
51475c7c
|
| 20-Apr-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - clean up backend headers for const and argument names
|
| #
6e6704a8
|
| 19-Apr-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1198 from CEED/jeremy/CeedCheck
Add CeedCheck macro to reduce repetition
|
| #
6574a04f
|
| 18-Apr-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
internal - add CeedCheck macro to reduce repetition
|
| #
49aac155
|
| 24-Mar-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
IWYU fixes (#1182)
* iwyu - include fixes
* iwyu - silence some iwyu output
* minor - clearer macro names
* iwyu - fix suggestion of "ceed/ceed.h" externally
* iwyu - lighter petsc headers
* iwyu - ceed/ceed.h -> ceed.h
* iwyu - cuda/hip include fixes
|
| #
023b8a51
|
| 25-Jan-2023 |
abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com> |
magma: non-tensor rtc (#1141)
* some refactoring in magma's jit src
* fix path
* fix loading src
* refactor magma nontensor backend
* refactor magma nontensor backend
* [WIP]: new nontensor basis kernels
* [WIP]: new nontensor basis kernels
* [WIP]: new nontensor basis kernels
* call the new nontensor kernels for low order problems
* multiple compilation for the same kernels but with different tuning parameters
* magma: allow different nb's for different non-tensor kernels
* tuning data for the non-tensor rtc kernels
* remove no-longer used functions, add new one for tuning the nontensor kernels
* constants for tuning
* tuning functions
* use the tuning functions in compiling/running the new kernels
* bug fix
* fixes
* fixes
* minor
* switch tuning data
* fix name
* fix name
* add function to run cuda kernels with opt-in shared memory feature
* minor fix
* minor fix
* fix calls to batch api
* allow more kernel instances
* temporary timing function
* temporary timing function
* tuning data based on hiprtc
* rollback tuning parameters
* fixes
* fixes
* fix inconsistency in the parameters passed to nvrtc/hiprtc
* minor
* a fix to the nb selector
* cleanup
* merge the opt-in feature in CeedRunKernelDimSharedOptinCuda into CeedRunKernelDimSharedCuda
* fix paths for hip-magma backends
* style
* fixes
* running make format
* undo changes from the last commit
* change HIP_DIR to ROCM_DIR and adjust the paths for magma accordingly
* replace HIP_DIR with ROCM_DIR
|
| #
2b730f8b
|
| 17-Nov-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Switch to clang-format (#1051)
* style - switch to clang-format
* ci - use newer libxsmm
* action - update format action
* format - consistent use of {} for multi-line if/for
* make - remove stray newline
* make - simpler 'make format' target
* ci - use newer libxsmm
* doc - minor release note clarification
* minor - minor fix
* minor - minor fix
* minor - minor fix
* minor - minor fix
* make format
* format - less aggressive alignment rules
* tidy - check for argument name mismatches
* fix newline
* format - mirror Ratel update to .clang-format
* fix merge error
* fix merge conflict
* fix merge error
* drop style in .phony list
* Update .clang-format
Co-authored-by: Jed Brown <jed@jedbrown.org>
* apply updated format
Co-authored-by: Jed Brown <jed@jedbrown.org>
|
| #
2dc3fb5f
|
| 31-Aug-2022 |
abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com> |
Icl/magma ntgemm (#1060)
* tuning data and driver for the non-tensor gemm
* header
* update magma non-tensor sgemm/dgemm to use the gemm selector
* add cpp files for the magma backend
* minor fix
* define CEED_INTERN for every function instead of a block definition
* include tuning data for CUDA or HIP only
* recent tuning data for a100 and mi250x
* style
* remove unused declarations
* expand tuning data for v100 and mi100
* switch to std array instead of std vector for individual records
* choose between gfx90a and gfx908 for HIP
* bug fix: choose between magma and vendor blas in non-batch mode
* style
|
| #
3cb13594
|
| 27-Jun-2022 |
Natalie Beams <246972+nbeams@users.noreply.github.com> |
Merge pull request #973 from CEED/icl/magma-rtc
Use RTC for MAGMA tensor basis kernels and element restrictions
|
| #
c42f38b1
|
| 24-Jun-2022 |
nbeams <246972+nbeams@users.noreply.github.com> |
Change naming style for MAGMA runtime compilation type/function defines
|
| #
e5f091eb
|
| 08-Jun-2022 |
nbeams <246972+nbeams@users.noreply.github.com> |
MAGMA: Use more specific macro name for HIP mode
|
| #
f6af633f
|
| 06-May-2022 |
nbeams <246972+nbeams@users.noreply.github.com> |
Use rtc for MAGMA elem restriction and tensor basis kernels
|
| #
ce18bed9
|
| 17-Mar-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #858 from CEED/jeremy/dump-copy-stuff
Strip redundant/outdated license info duplication
|
| #
3d8e8822
|
| 17-Mar-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - update copyright headers
|
| #
f99981a3
|
| 25-Feb-2022 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #893 from CEED/natalie/more-hip-launch-bounds
HIP: add atomics flag and more kernel launch bounds for performance improvements
|
| #
e2cfdb03
|
| 18-Feb-2022 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #902 from CEED/jed/cuda-block-sizes
backends cuda-shared: fix launch bounds to avoid invalid z dimension
|
| #
c8b3a627
|
| 18-Feb-2022 |
Jed Brown <jed@jedbrown.org> |
backends/magma: fix pinned vs unpinned memory free bug
|
| #
f71aa81b
|
| 01-Feb-2022 |
nbeams <246972+nbeams@users.noreply.github.com> |
add launch bounds to magma kernels; add macro definition for y-dim of magma basis kernel threadblocks
Co-authored-by: Ahmad Abdelfattah <ahmad@icl.utk.edu>
|
| #
80a9ef05
|
| 02-Sep-2021 |
Natalie Beams <246972+nbeams@users.noreply.github.com> |
Allow CeedScalar to be single precision (#788)
One can modify `ceed.h` to include `ceed-f32.h` and then use single precision. This is tested for C in CI and has been tested by developers with Rust, Julia, and Python. This interface is evolving and should be considered experimental at this time (thus lack of automated build support).
* Introduce CeedScalarType enum
* WIP changes to allow different definitions of CeedScalar
* Introduce new header files for float and double
* Only use avx tensor contract and MAGMA non-tensor basis if CeedScalar is double
* WIP changes to allow CeedScalar to be float
* WIP start trying to adjust test tolerances for float or double
* fix typos in comments
* install ceed-f32/64 headers
* Fix missing casts for hipMAGMA element restrictions
* make CeedQFunctionContextGetContextSize available for Python bindings
* Changes to Python bindings to allow CeedScalar to be float
* WIP adjust Python tests for float or double
* make style
* remove QFunctionContextGetContextSize from backend header
* Use quotes instead of <> in include statement
* Remove unnecessary includes
* Update tolerances for tests
* [Julia] allow CeedScalar to be Float32
* [Julia] Use Preferences instead of custom build configuration
# Conflicts:
# julia/LibCEED.jl/src/C.jl
* [Makefile] Change definition of CC_VENDOR so it works with cross-compilation
* [Julia] Use Preferences in CI
# Conflicts:
# .github/workflows/julia-test-with-style.yml
* [Julia] Update docs about preferences
* [Julia] Add test/Project.toml workaround for Preferences
* Add CeedGetScalarType to get the type of CeedScalar at runtime
* [Julia] Move functions from Ceed.jl to LibCEED.jl
* [Julia] Add support for getting library path and scalar type at runtime
* [Julia] Minor change to checking if CUDA is loaded
* [Julia] Check correct CeedScalar types in basis functions
* [Julia] Fix tests comparing with output file
* [Julia] Change devtests to use CeedScalar instead of Float64
* Update test 402 so context will be same size in double or float
* Update tolerances for ceed examples
* [Julia] CUDA fixes
* remove unused variable in t208
* SchurDecomposition: do not compute tau on final iteration
* Update tolerances for some basis tests (for single precision)
* Make style
* Python style fixes for basis test
* Add single precision output for t300 and t320 and adjust checks; skip t541 in single
* Add LCOV exclusions after moving to new line
* fix spacing
* Python: make CEED_EPSILON available as libceed.EPSILON
* Python: optional parameter to specify different output file for test comparison
* Python: update tests' use of EPSILON and change test_300 output file for single precision
* Python: add convenience function for getting dtype corresponding to CeedScalar
* rust - add single precision support
* [Julia] Fall back on Float64 if CeedGetScalarType is not available
* [Julia] style
* Adjust tolerance for t301
* xsmm - add single precision support
* avx - add single precision support
* Add initial single precision support for MAGMA non-tensor basis
* Skip t300 and t320 in single precision; revert Python t300 changes
* Revert output changes for t300 and t320 in junit
* [Julia] Changes to autogenerated bindings for mixed precision
* [Julia] style
* [Julia] Check scalar type when changing libceed library path
The check is also performed when the package is loaded. This prevents having to
restart the Julia session twice
* [Julia] Require JLLWrappers version 1.3
This is needed to use Preferences to change the library path
* Add documentation page for precision development
Co-authored-by: Will Pazner <will.e.p@gmail.com>
* Cleanup from merge: remove old README
* Return CEED_ALIGN to backend.h
* Make Fortran compiler (FC) optional; empty skips Fortran tests
Use in Python and Rust builds, which may not have a Fortran compiler
installed and thus would produce confusing output.
* Add single precision CI test for Noether
Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
Co-authored-by: Will Pazner <will.e.p@gmail.com>
Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
Co-authored-by: Jed Brown <jed@jedbrown.org>
|
| #
972b3d9d
|
| 27-May-2021 |
Natalie Beams <246972+nbeams@users.noreply.github.com> |
Minor fixes in backends/hip and backends/magma (#771)
* fix typo in ceed-magma header def
* Change setting of gcnArchName option to avoid string overflow
|