History log of /libCEED/rust/libceed-sys/c-src/backends/magma/ceed-magma.h (Results 26 – 50 of 83)
Revision Date Author Comments
# 00125730 25-Jul-2023 Sebastian Grimberg <sjg@amazon.com>

Add missing checks for support of different element restriction types in backends


# 0305e208 06-May-2023 Sebastian Grimberg <sjg@amazon.com>

Update backends for unified ElemRestrictionCreate variants for all restriction types (default, oriented, strided)


# 4b35598d 20-Jun-2023 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #1231 from CEED/jeremy/consistency

Consistency fixes


# eb7e6caf 16-Jun-2023 Jeremy L Thompson <jeremy@jeremylt.org>

gpu - naming consistency fixes


# 30d6126f 25-Apr-2023 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #1202 from CEED/jeremy/const-fix

Clean up backend headers for const and argument names


# 472941f0 21-Apr-2023 Jeremy L Thompson <jeremy@jeremylt.org>

minor - fix static vs CEED_INTERN in backend file


# 51475c7c 20-Apr-2023 Jeremy L Thompson <jeremy@jeremylt.org>

minor - clean up backend headers for const and argument names


# 6e6704a8 19-Apr-2023 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #1198 from CEED/jeremy/CeedCheck

Add CeedCheck macro to reduce repetition


# 6574a04f 18-Apr-2023 Jeremy L Thompson <jeremy@jeremylt.org>

internal - add CeedCheck macro to reduce repetition


# 49aac155 24-Mar-2023 Jeremy L Thompson <jeremy@jeremylt.org>

IWYU fixes (#1182)

* iwyu - include fixes

* iwyu - silence some iwyu output

* minor - clearer macro names

* iwyu - fix suggestion of "ceed/ceed.h" externally

* iwyu - lighter petsc headers

* iwyu - ceed/ceed.h -> ceed.h

* iwyu - cuda/hip include fixes


# 023b8a51 25-Jan-2023 abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com>

magma: non-tensor rtc (#1141)

* some refactoring in magma's jit src

* fix path

* fix loading src

* refactor magma nontensor backend

* refactor magma nontensor backend

* [WIP]: new nontensor basis kernels

* [WIP]: new nontensor basis kernels

* [WIP]: new nontensor basis kernels

* call the new nontensor kernels for low order problems

* multiple compilation for the same kernels but with different tuning parameters

* magma: allow different nb's for different non-tensor kernels

* tuning data for the non-tensor rtc kernels

* remove no-longer used functions, add new one for tuning the nontensor kernels

* constants for tuning

* tuning functions

* use the tuning functions in compiling/running the new kernels

* bug fix

* fixes

* fixes

* minor

* switch tuning data

* fix name

* fix name

* add function to run cuda kernels with opt-in shared memory feature

* minor fix

* minor fix

* fix calls to batch api

* allow more kernel instances

* temporary timing function

* temporary timing function

* tuning data based on hiprtc

* rollback tuning parameters

* fixes

* fixes

* fix inconsistency in the parameters passed to nvrtc/hiprtc

* minor

* a fix to the nb selector

* cleanup

* merge the opt-in feature in CeedRunKernelDimSharedOptinCuda into CeedRunKernelDimSharedCuda

* fix paths for hip-magma backends

* style

* fixes

* running make format

* undo changes from the last commit

* change HIP_DIR to ROCM_DIR and adjust the paths for magma accordingly

* replace HIP_DIR with ROCM_DIR


# 2b730f8b 17-Nov-2022 Jeremy L Thompson <jeremy@jeremylt.org>

Switch to clang-format (#1051)

* style - switch to clang-format

* ci - use newer libxsmm

* action - update format action

* format - consistent use of {} for multi-line if/for

* make - remove stray newline

* make - simpler 'make format' target

* ci - use newer libxsmm

* doc - minor release note clarification

* minor - minor fix

* minor - minor fix

* minor - minor fix

* minor - minor fix

* make format

* format - less aggressive alignment rules

* tidy - check for argument name mismatches

* fix newline

* format - mirror Ratel update to .clang-format

* fix merge error

* fix merge conflict

* fix merge error

* drop style in .phony list

* Update .clang-format

Co-authored-by: Jed Brown <jed@jedbrown.org>

* apply updated format

Co-authored-by: Jed Brown <jed@jedbrown.org>


# 2dc3fb5f 31-Aug-2022 abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com>

Icl/magma ntgemm (#1060)

* tuning data and driver for the non-tensor gemm

* header

* update magma non-tensor sgemm/dgemm to use the gemm selector

* add cpp files for the magma backend

* minor fix

* define CEED_INTERN for every function instead of a block definition

* include tuning data for CUDA or HIP only

* recent tuning data for a100 and mi250x

* style

* remove unused declarations

* expand tuning data for v100 and mi100

* switch to std array instead of std vector for individual records

* choose between gfx90a and gfx908 for HIP

* bug fix: choose between magma and vendor blas in non-batch mode

* style


# 3cb13594 27-Jun-2022 Natalie Beams <246972+nbeams@users.noreply.github.com>

Merge pull request #973 from CEED/icl/magma-rtc

Use RTC for MAGMA tensor basis kernels and element restrictions


# c42f38b1 24-Jun-2022 nbeams <246972+nbeams@users.noreply.github.com>

Change naming style for MAGMA runtime compilation type/function defines


# e5f091eb 08-Jun-2022 nbeams <246972+nbeams@users.noreply.github.com>

MAGMA: Use more specific macro name for HIP mode


# f6af633f 06-May-2022 nbeams <246972+nbeams@users.noreply.github.com>

Use rtc for MAGMA elem restriction and tensor basis kernels


# ce18bed9 17-Mar-2022 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #858 from CEED/jeremy/dump-copy-stuff

Strip redundant/outdated license info duplication


# 3d8e8822 17-Mar-2022 Jeremy L Thompson <jeremy@jeremylt.org>

minor - update copyright headers


# f99981a3 25-Feb-2022 Jed Brown <jed@jedbrown.org>

Merge pull request #893 from CEED/natalie/more-hip-launch-bounds

HIP: add atomics flag and more kernel launch bounds for performance improvements


# e2cfdb03 18-Feb-2022 Jed Brown <jed@jedbrown.org>

Merge pull request #902 from CEED/jed/cuda-block-sizes

backends cuda-shared: fix launch bounds to avoid invalid z dimension


# c8b3a627 18-Feb-2022 Jed Brown <jed@jedbrown.org>

backends/magma: fix pinned vs unpinned memory free bug


# f71aa81b 01-Feb-2022 nbeams <246972+nbeams@users.noreply.github.com>

add launch bounds to magma kernels;
add macro definition for y-dim of magma basis kernel threadblocks

Co-authored-by: Ahmad Abdelfattah <ahmad@icl.utk.edu>


# 80a9ef05 02-Sep-2021 Natalie Beams <246972+nbeams@users.noreply.github.com>

Allow CeedScalar to be single precision (#788)

One can modify `ceed.h` to include `ceed-f32.h` and then use single precision. This is tested for C in CI and has been tested by developers with Rust, Julia, and Python. This interface is evolving and should be considered experimental at this time (thus lack of automated build support).

* Introduce CeedScalarType enum

* WIP changes to allow different definitions of CeedScalar

* Introduce new header files for float and double

* Only use avx tensor contract and MAGMA non-tensor basis if CeedScalar is double

* WIP changes to allow CeedScalar to be float

* WIP start trying to adjust test tolerances for float or double

* fix typos in comments

* install ceed-f32/64 headers

* Fix missing casts for hipMAGMA element restrictions

* make CeedQFunctionContextGetContextSize available for Python bindings

* Changes to Python bindings to allow CeedScalar to be float

* WIP adjust Python tests for float or double

* make style

* remove QFunctionContextGetContextSize from backend header

* Use quotes instead of <> in include statement

* Remove unnecessary includes

* Update tolerances for tests

* [Julia] allow CeedScalar to be Float32

* [Julia] Use Preferences instead of custom build configuration

# Conflicts:
# julia/LibCEED.jl/src/C.jl

* [Makefile] Change definition of CC_VENDOR so it works with cross-compilation

* [Julia] Use Preferences in CI

# Conflicts:
# .github/workflows/julia-test-with-style.yml

* [Julia] Update docs about preferences

* [Julia] Add test/Project.toml workaround for Preferences

* Add CeedGetScalarType to get the type of CeedScalar at runtime

* [Julia] Move functions from Ceed.jl to LibCEED.jl

* [Julia] Add support for getting library path and scalar type at runtime

* [Julia] Minor change to checking if CUDA is loaded

* [Julia] Check correct CeedScalar types in basis functions

* [Julia] Fix tests comparing with output file

* [Julia] Change devtests to use CeedScalar instead of Float64

* Update test 402 so context will be same size in double or float

* Update tolerances for ceed examples

* [Julia] CUDA fixes

* remove unused variable in t208

* SchurDecomposition: do not compute tau on final iteration

* Update tolerances for some basis tests (for single precision)

* Make style

* Python style fixes for basis test

* Add single precision output for t300 and t320 and adjust checks; skip t541 in single

* Add LCOV exclusions after moving to new line

* fix spacing

* Python: make CEED_EPSILON available as libceed.EPSILON

* Python: optional parameter to specify different output file for test comparison

* Python: update tests' use of EPSILON and change test_300 output file for single precision

* Python: add convenience function for getting dtype corresponding to CeedScalar

* rust - add single precision support

* [Julia] Fall back on Float64 if CeedGetScalarType is not available

* [Julia] style

* Adjust tolerance for t301

* xsmm - add single precision support

* avx - add single precision support

* Add initial single precision support for MAGMA non-tensor basis

* Skip t300 and t320 in single precision; revert Python t300 changes

* Revert output changes for t300 and t320 in junit

* [Julia] Changes to autogenerated bindings for mixed precision

* [Julia] style

* [Julia] Check scalar type when changing libceed library path

The check is also performed when the package is loaded. This prevents having to
restart the Julia session twice

* [Julia] Require JLLWrappers version 1.3

This is needed to use Preferences to change the library path

* Add documentation page for precision development

Co-authored-by: Will Pazner <will.e.p@gmail.com>

* Cleanup from merge: remove old README

* Return CEED_ALIGN to backend.h

* Make Fortran compiler (FC) optional; empty skips Fortran tests

Use in Python and Rust builds, which may not have a Fortran compiler
installed and thus would produce confusing output.

* Add single precision CI test for Noether

Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>

Co-authored-by: Will Pazner <will.e.p@gmail.com>
Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
Co-authored-by: Jed Brown <jed@jedbrown.org>


# 972b3d9d 27-May-2021 Natalie Beams <246972+nbeams@users.noreply.github.com>

Minor fixes in backends/hip and backends/magma (#771)

* fix typo in ceed-magma header def

* Change setting of gcnArchName option to avoid string overflow

