History log of /libCEED/backends/hip-shared/ceed-hip-shared.h (Results 1 – 25 of 44)
Revision Date Author Comments
# d4cc1845 30-Dec-2025 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #1912 from CEED/jeremy/copyright

minor - update copyright to 2026


# 9ba83ac0 19-Dec-2025 Jeremy L Thompson <jeremy@jeremylt.org>

minor - update copyright to 2026


# 20a16a5f 20-Mar-2025 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #1786 from CEED/jeremy/copy-headers

minor - upate copyright to 2025


# d275d636 19-Mar-2025 Jeremy L Thompson <jeremy@jeremylt.org>

minor - upate copyright to 2025


# 8c2d8641 13-Feb-2025 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #1754 from CEED/jeremy/shared-points-transpose

gpu - add Transpose/TransposeAdd variants for AtPoints


# af0e6e89 13-Feb-2025 Jeremy L Thompson <jeremy@jeremylt.org>

gpu - add Transpose/TransposeAdd variants for AtPoints


# 1a63be7e 09-Jan-2025 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #1721 from CEED/jeremy/shared-nontensor

Add non-tensor shared


# 6c13bbcb 07-Jan-2025 Jeremy L Thompson <jeremy@jeremylt.org>

hip - add nontensor shared


# be8d6f55 12-Nov-2024 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #1710 from CEED/jeremy/split-at-points

Split AtPoints basis between Transpose/no


# 81ae6159 11-Nov-2024 Jeremy L Thompson <jeremy@jeremylt.org>

gpu - split AtPoints basis between Transpose/no


# 25c4e04a 05-Sep-2024 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #1655 from CEED/jeremy/at-points-transpose

AtPoints - fix transpose basis apply on GPU


# 111870fe 04-Sep-2024 Jeremy L Thompson <jeremy@jeremylt.org>

AtPoints - fix transpose basis apply on GPU


# db2becc9 13-Aug-2024 Jeremy L Thompson <jeremy@jeremylt.org>

Add CeedBasisApplyAdd (#1644)

* basis - add CeedBasisApplyAdd + CPU impl

* basis - add ref GPU ApplyAdd

* basis - add shared GPU ApplyAdd

* basis - add MAGMA ApplyAdd

* basis - add CeedBasisApplyAddAtPoints + default impl

* basis - add GPU ApplyAddAtPoints

* tidy - add extra assert to fix clang-tidy

* Apply suggestions from code review

style - consistently use indexing over pointer arithmatic

Co-authored-by: Zach Atkins <zach.atkins@colorado.edu>

* style - more pointer fixes

---------

Co-authored-by: Zach Atkins <zach.atkins@colorado.edu>


# 37fb1fa7 11-Jul-2024 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #1597 from CEED/jeremy/basis-at-points-gpu

GPU BasisApplyAtPoints


# 1dda9c1a 17-Jun-2024 Jeremy L Thompson <jeremy@jeremylt.org>

gpu - add intial AtPoints to shared mem backends, but using ref impl


# 509d4af6 28-Mar-2024 Jeremy L Thompson <jeremy@jeremylt.org>

Drop JiT Guards in Most QF Source (#1540)

* backend - use pragma once

* gallery - drop source header guards

* ex - drop some qfunction source header guards

* fluids - drop guards on all singly included headers

* jit - drop most guards on backend JiT files

* sycl - drop extra header guards

* jit - enable #pragma once for QF source

* fluids - use #pragma once for util/helper qf source

* test - check different multiple includes

* fluids - fix odd include

* jit - update interface for building JiT string from multiple files


# a171b6ef 27-Mar-2024 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #1537 from CEED/jeremy/pragma-once

Use #pragma once for non-JiT headers


# 5aed82e4 27-Mar-2024 Jeremy L Thompson <jeremy@jeremylt.org>

minor - update copyright headers


# 31c137a9 01-Sep-2023 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #1320 from CEED/jeremy/jit-header-guards

style - fix header guards


# 94b7b29b 01-Sep-2023 Jeremy L Thompson <jeremy@jeremylt.org>

style - fix header guards


# 6e6704a8 19-Apr-2023 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #1198 from CEED/jeremy/CeedCheck

Add CeedCheck macro to reduce repetition


# 6574a04f 18-Apr-2023 Jeremy L Thompson <jeremy@jeremylt.org>

internal - add CeedCheck macro to reduce repetition


# 49aac155 24-Mar-2023 Jeremy L Thompson <jeremy@jeremylt.org>

IWYU fixes (#1182)

* iwyu - include fixes

* iwyu - silence some iwyu output

* minor - clearer macro names

* iwyu - fix suggestion of "ceed/ceed.h" externally

* iwyu - lighter petsc headers

* iwyu - ceed/ceed.h -> ceed.h

* iwyu - cuda/hip include fixes


# 2b730f8b 17-Nov-2022 Jeremy L Thompson <jeremy@jeremylt.org>

Switch to clang-format (#1051)

* style - switch to clang-format

* ci - use newer libxsmm

* action - update format action

* format - consistent use of {} for multi-line if/for

* make - remove stray newline

* make - simpler 'make format' target

* ci - use newer libxsmm

* doc - minor release note claification

* minor - minor fix

* minor - minor fix

* minor - minor fix

* minor - minor fix

* make format

* format - less aggressive alignment rules

* tidy - check for argument name mismatches

* fix newline

* format - mirror Ratel update to .clang-format

* fix merge error

* fix merge conflict

* fix merge error

* drop style in .phony list

* Update .clang-format

Co-authored-by: Jed Brown <jed@jedbrown.org>

* apply updated format

Co-authored-by: Jed Brown <jed@jedbrown.org>


# 9e201c85 23-Sep-2022 Yohann <dudouit1@llnl.gov>

Refactor `cuda-gen` and `hip-gen` backends. (#1050)

* Add TODO items.

* rough, but something like this?

* wip - cleaning up some warnings, but more remain

* wip - reorganize

* wip - missing kernels

* wip - replace t1d

* fix some kernels

* another typo

* more

* another one

* closer

* define T_1D

* typosgit add .!

* WIP: changes to cuda-shared framework for new kernels

* fix output writing

* buffer fix

* buffer sizes

* WIP: fixes for 2 and 3D basis kernels

* minor

* fix weight kernel for 3d

* remove debugging output

* minor reorg

* fix includes

* enable collo grad for cuda-shared

* move quoted kernels

* renaming

* missed a rename

* small fix

* more naming consistency

* faster 'useCollograd=false' path in *-gen

* more style

* one last style fix

* clearer collograd condition

* Add gen basis kernels to hip-shared

* Try some changes to hip-shared basis block sizes for new kernels

* cuda - drop extra kernel arg

* cuda - fix collograd check logic

* update gen comment about parallelization

* tidy up fields struct definition

* tidy up structs even more

* Update hip-gen basis templates use and move other hip-gen device functions to jit-source

* Finish hip-gen basis template update; small style updates to match CUDA

* missing isStrided

* Update block size used in 3D weight for new shared kernels

* update release notes

Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
Co-authored-by: nbeams <246972+nbeams@users.noreply.github.com>
