| 7f836c31 | 27-Oct-2024 |
James Wright <james@jameswright.xyz> |
fix(sycl): Replaces pragma once with include guards
OpenCL doesn't really like pragma once evidently. I think we've been 'getting away' with it previously as the JIT processing automatically doesn't allow for nested includes, but the same is not done for <ceed/types.h>?
|
| 6a96780f | 18-Oct-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - header consistency |
| c0b5abf0 | 17-Oct-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
qf - prefer ceed/types.h over ceed.h in qf source |
| 111870fe | 04-Sep-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
AtPoints - fix transpose basis apply on GPU |
| db2becc9 | 13-Aug-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Add CeedBasisApplyAdd (#1644)
* basis - add CeedBasisApplyAdd + CPU impl
* basis - add ref GPU ApplyAdd
* basis - add shared GPU ApplyAdd
* basis - add MAGMA ApplyAdd
* basis - add CeedBasisApplyAddAtPoints + default impl
* basis - add GPU ApplyAddAtPoints
* tidy - add extra assert to fix clang-tidy
* Apply suggestions from code review
style - consistently use indexing over pointer arithmetic
Co-authored-by: Zach Atkins <zach.atkins@colorado.edu>
* style - more pointer fixes
---------
Co-authored-by: Zach Atkins <zach.atkins@colorado.edu>
|
| 80c135a8 | 10-Jul-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - less explicit memory shuffling to build Chebyshev der |
| ad8059fc | 10-Jul-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - reduce write conflicts for AtPoints basis operations |
| f7c9815f | 20-Jun-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
AtPoints - ease memory requirement |
| 2d10e82c | 17-Jun-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
AtPoints - fix gpu thread usage |
| 1c21e869 | 11-Jun-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
hip - add BasisApplyAtPoints |
| 34d14614 | 30-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
cuda - impl BasisApplyAtPoints |
| 2d903c70 | 12-Jun-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - drop extra copy in ref tensor basis interp |
| 0b2e4913 | 12-Jun-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
style - bool name consistency |
| 0b63de31 | 17-May-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
rstr - transpose AtPoints restriction fixes |
| 509d4af6 | 28-Mar-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Drop JiT Guards in Most QF Source (#1540)
* backend - use pragma once
* gallery - drop source header guards
* ex - drop some qfunction source header guards
* fluids - drop guards on all singly included headers
* jit - drop most guards on backend JiT files
* sycl - drop extra header guards
* jit - enable #pragma once for QF source
* fluids - use #pragma once for util/helper qf source
* test - check different multiple includes
* fluids - fix odd include
* jit - update interface for building JiT string from multiple files
|
| 5aed82e4 | 27-Mar-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - update copyright headers |
| 6e536b99 | 08-Mar-2024 |
Jeremy L Thompson <jeremy@jeremylt.org> |
interface - add Ceed*ReturnCeed |
| 1b7492f8 | 22-Jan-2024 |
Sebastian Grimberg <sjg@amazon.com> |
Fix a few incorrect JIT header includes |
| cf8cbdd6 | 22-Jan-2024 |
Sebastian Grimberg <sjg@amazon.com> |
Refactor GPU-backend element restriction source to only compile the required kernels, and do so lazily at first apply |
| cbfe683a | 20-Jan-2024 |
Sebastian Grimberg <sjg@amazon.com> |
Lazily compile GPU diagonal/point-block diagonal assembly kernels as needed
Also fixes CEED_SIZE -> USE_CEEDSIZE definition. |
| 7aa91133 | 08-Nov-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Add deterministic element restriction transpose GPU kernels for oriented and curl-oriented restrictions |
| 004e4986 | 16-Aug-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Operator full assembly and diagonal assembly for cuda-ref and hip-ref backends for H(div) and H(curl) elements |
| dce49693 | 15-Aug-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Initial commit for cuda-ref and hip-ref backend support for oriented and curl-oriented element restrictions
For now, the element restriction transpose for oriented or curl-oriented cases is always non-deterministic.
|
| d075f50b | 11-Aug-2023 |
Sebastian Grimberg <sjg@amazon.com> |
H(div) and H(curl) basis support for cuda-ref and hip-ref backends |
| 86ad04cc | 06-Nov-2023 |
Sebastian Grimberg <sjg@amazon.com> |
Minor cleanup to magma non-tensor kernel |