fix(sycl): Replaces pragma once with include guards
OpenCL doesn't really like pragma once evidently. I think we've been 'getting away' with it previously as the JIT processing automatically doesn't allow for nested includes, but the same is not done for <ceed/types.h>?
minor - header consistency
qf - prefer ceed/types.h over ceed.h in qf source
AtPoints - fix transpose basis apply on GPU
Add CeedBasisApplyAdd (#1644)
* basis - add CeedBasisApplyAdd + CPU impl
* basis - add ref GPU ApplyAdd
* basis - add shared GPU ApplyAdd
* basis - add MAGMA ApplyAdd
* basis - add CeedBasisApplyAddAtPoints + default impl
* basis - add GPU ApplyAddAtPoints
* tidy - add extra assert to fix clang-tidy
* Apply suggestions from code review
  style - consistently use indexing over pointer arithmetic
  Co-authored-by: Zach Atkins <zach.atkins@colorado.edu>
* style - more pointer fixes
---------
Co-authored-by: Zach Atkins <zach.atkins@colorado.edu>
gpu - less explicit memory shuffling to build Chebyshev der
gpu - reduce write conflicts for AtPoints basis operations
AtPoints - ease memory requirement
AtPoints - fix gpu thread usage
hip - add BasisApplyAtPoints
cuda - impl BasisApplyAtPoints
gpu - drop extra copy in ref tensor basis interp
style - bool name consistency
rstr - transpose AtPoints restriction fixes
Drop JiT Guards in Most QF Source (#1540)
* backend - use pragma once
* gallery - drop source header guards
* ex - drop some qfunction source header guards
* fluids - drop guards on all singly included headers
* jit - drop most guards on backend JiT files
* sycl - drop extra header guards
* jit - enable #pragma once for QF source
* fluids - use #pragma once for util/helper qf source
* test - check different multiple includes
* fluids - fix odd include
* jit - update interface for building JiT string from multiple files
minor - update copyright headers
interface - add Ceed*ReturnCeed
Fix a few incorrect JIT header includes
Refactor GPU-backend element restriction source to only compile the required kernels, and do so lazily at first apply
Lazily compile GPU diagonal/point-block diagonal assembly kernels as needed
Also fixes CEED_SIZE -> USE_CEEDSIZE definition.
Add deterministic element restriction transpose GPU kernels for oriented and curl-oriented restrictions
Operator full assembly and diagonal assembly for cuda-ref and hip-ref backends for H(div) and H(curl) elements
Initial commit for cuda-ref and hip-ref backend support for oriented and curl-oriented element restrictions
For now, the element restriction transpose for oriented or curl-oriented cases is always non-deterministic.
H(div) and H(curl) basis support for cuda-ref and hip-ref backends
Minor cleanup to magma non-tensor kernel