History log of /libCEED/backends/ (Results 701 – 725 of 1139)
Revision Date Author Comments
347d7e4c 23-Jun-2020 Jeremy L Thompson <thompson.jeremy.luke@gmail.com>

CUDA - add small leak fix in QFunction JiT load

7df94212 23-Jun-2020 Jeremy L Thompson <thompson.jeremy.luke@gmail.com>

CUDA - clean up includes

73b3ccaf 23-Jun-2020 Jeremy L Thompson <thompson.jeremy.luke@gmail.com>

CUDA - clean up minor memory leak

599c6f32 24-Jun-2020 Jeremy L Thompson <thompson.jeremy.luke@gmail.com>

XSMM - switch to LIBXSMM 143409c, reverted redefinition of _Bool

0df135b4 22-Jun-2020 nbeams <246972+nbeams@users.noreply.github.com>

comment out the hiprtc-related calls causing memory errors

0131da11 19-Jun-2020 Jeremy L Thompson <thompson.jeremy.luke@gmail.com>

XSMM - use v1.16

81a63d6f 17-Jun-2020 nbeams <246972+nbeams@users.noreply.github.com>

fix size of options list for hiprtcCompile

9e9210b8 17-Jun-2020 Jeremy L Thompson <thompson.jeremy.luke@gmail.com>

Op - add AssembleAdd version of diagonal assembly functions, will be helpful for MFEM integration

2bba3ffa 17-Jun-2020 Jeremy L Thompson <thompson.jeremy.luke@gmail.com>

Op - change LinearAssemble* to accept CeedVector instead of pointer to CeedVector, allows for less memory movement and interfaces with parent code better

30f4f45f 16-Jun-2020 nbeams <246972+nbeams@users.noreply.github.com>

Experimental HIP port of cuda-ref, round 2, with ROCm3.5

82253e48 15-Jun-2020 Jeremy L Thompson <thompson.jeremy.luke@gmail.com>

make style

80ac2e43 15-Jun-2020 Jeremy L Thompson <thompson.jeremy.luke@gmail.com>

style - rename AssembleLinear* to LinearAssemble*

fd364f38 15-Jun-2020 Jeremy L Thompson <thompson.jeremy.luke@gmail.com>

style - change Get*Status to Is*

c04a41a7 15-Jun-2020 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Composite Operator support for AssembleLinearDiag/PBDiag (#552)

* ref - add composite operator support for building diagonal/pb diagonal

* ref - add error for non-composite multi-field operator diagonal/pb diagonal assembly

* tap - add t538 exclusion because OCCA does not support galleries

* tests - adjust test cases for ceed examples for test coverage

* Op - fix documentation

b1d74153 12-Jun-2020 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

CUDA/MAGMA - add AssembleLinearQFunction (#553)

* CUDA - refactor operator apply for CUDA assemble linear QF impl

* CUDA/MAGMA - add AssembleLinearQFunction

* make style

* CUDA - clean up assembleLinearQF after q/e layout refactor

* CUDA - fallback operator for cuda/gen to cuda/ref

* CUDA - use delegation for cuda/gen preferred memtype

49fd234c 12-Jun-2020 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Convert CUDA ref/reg/shared E-Layout (#554)

* tests - update tests for multiple e-layouts

* CUDA - convert ref and reg restrictions to Q-layout

* CUDA - ref/reg/shared use gen/magma E-Layout for multi element basis apply and operator apply

* CUDA/MAGMA - drop eandqdiffer and separate MAGMA operator code

* CUDA - update operator comment

* reg - clarify read/write dofs/quads

* CUDA - drop dead code

d965c7a7 06-Jun-2020 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

CPU Impl of AssemblePointBlockDiagonal (#503)

* Operator - add AssemblePointBlockDiagonal for CPU backends, with test

* CUDA - add point block diagonal not supported message

* make style

* Operator - improve point block description

* OCCA - explicitly remove OCCA fallback to CPU assembly functions, will update after new OCCA backend

* Op - remove gap removal in point block diagonal

* Op - update diagonal assembly documentation

* Update backends/ref/ceed-ref-operator.c

Co-authored-by: Jed Brown <jed@jedbrown.org>

* style - fix extra space in * with nopad

Co-authored-by: Jed Brown <jed@jedbrown.org>

59ae912b 21-May-2020 Jeremy L Thompson <thompson.jeremy.luke@gmail.com>

OCCA - small fixes highlighted by icc

9a2291e3 19-May-2020 Jeremy L Thompson <thompson.jeremy.luke@gmail.com>

CUDA - fix empty restriction support, must use GetStridedStatus over buggy indices pointer check

e0582403 15-May-2020 abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com>

Icl/magma queue (#524)

Update the MAGMA backend:

* add new specialized tensor basis kernels

* add batched DGEMM wrapper for non-tensor basis kernels

* switch backend to use MAGMA v2 interface

Co-authored-by: nbeams <246972+nbeams@users.noreply.github.com>
Co-authored-by: Stan Tomov <tomov@eecs.utk.edu>

9ff7e165 14-May-2020 Jed Brown <jed@jedbrown.org>

cuda-gen: avoid gcc-10 warning on use of strncat


/libCEED/.gitignore
/libCEED/.travis.yml
cuda-gen/ceed-cuda-gen-qfunction.c
/libCEED/examples/fluids/README.md
/libCEED/examples/fluids/advection.h
/libCEED/examples/fluids/advection2d.h
/libCEED/examples/fluids/densitycurrent.h
/libCEED/examples/fluids/index.rst
/libCEED/examples/fluids/navierstokes.c
/libCEED/examples/fluids/tests-output/fluids-navierstokes-explicit.bin
/libCEED/examples/fluids/tests-output/fluids-navierstokes-implicit-stab-none.bin
/libCEED/examples/fluids/tests-output/fluids-navierstokes-implicit-stab-supg.bin
/libCEED/examples/petsc/area.c
/libCEED/examples/petsc/bpssphere.c
/libCEED/examples/petsc/setup.h
/libCEED/examples/petsc/setuparea.h
/libCEED/examples/petsc/setupsphere.h
/libCEED/examples/solids/README.rst
/libCEED/examples/solids/elasticity.c
/libCEED/examples/solids/elasticity.h
/libCEED/examples/solids/src/cloptions.c
/libCEED/examples/solids/src/setuplibceed.c
/libCEED/include/ceedf.h
/libCEED/tests/t000-ceed-f.f90
/libCEED/tests/t001-ceed-f.f90
/libCEED/tests/t003-ceed-f.f90
/libCEED/tests/t100-vector-f.f90
/libCEED/tests/t101-vector-f.f90
/libCEED/tests/t102-vector-f.f90
/libCEED/tests/t103-vector-f.f90
/libCEED/tests/t104-vector-f.f90
/libCEED/tests/t105-vector-f.f90
/libCEED/tests/t106-vector-f.f90
/libCEED/tests/t107-vector-f.f90
/libCEED/tests/t108-vector-f.f90
/libCEED/tests/t200-elemrestriction-f.f90
/libCEED/tests/t201-elemrestriction-f.f90
/libCEED/tests/t202-elemrestriction-f.f90
/libCEED/tests/t208-elemrestriction-f.f90
/libCEED/tests/t209-elemrestriction-f.f90
/libCEED/tests/t210-elemrestriction-f.f90
/libCEED/tests/t211-elemrestriction-f.f90
/libCEED/tests/t212-elemrestriction-f.f90
/libCEED/tests/t300-basis-f.f90
/libCEED/tests/t301-basis-f.f90
/libCEED/tests/t302-basis-f.f90
/libCEED/tests/t304-basis-f.f90
/libCEED/tests/t305-basis-f.f90
/libCEED/tests/t306-basis-f.f90
/libCEED/tests/t313-basis-f.f90
/libCEED/tests/t314-basis-f.f90
/libCEED/tests/t320-basis-f.f90
/libCEED/tests/t322-basis-f.f90
/libCEED/tests/t323-basis-f.f90
/libCEED/tests/t400-qfunction-f.f90
/libCEED/tests/t401-qfunction-f.f90
/libCEED/tests/t402-qfunction-f.f90
/libCEED/tests/t410-qfunction-f.f90
/libCEED/tests/t411-qfunction-f.f90
/libCEED/tests/t412-qfunction-f.f90
/libCEED/tests/t413-qfunction-f.f90
/libCEED/tests/t500-operator-f.f90
/libCEED/tests/t501-operator-f.f90
/libCEED/tests/t503-operator-f.f90
/libCEED/tests/t504-operator-f.f90
/libCEED/tests/t505-operator-f.f90
/libCEED/tests/t506-operator-f.f90
/libCEED/tests/t510-operator-f.f90
/libCEED/tests/t511-operator-f.f90
/libCEED/tests/t520-operator-f.f90
/libCEED/tests/t521-operator-f.f90
/libCEED/tests/t522-operator-f.f90
/libCEED/tests/t523-operator-f.f90
/libCEED/tests/t524-operator-f.f90
/libCEED/tests/t530-operator-f.f90
/libCEED/tests/t531-operator-f.f90
/libCEED/tests/t532-operator-f.f90
/libCEED/tests/t533-operator-f.f90
/libCEED/tests/t534-operator-f.f90
/libCEED/tests/t535-operator-f.f90
/libCEED/tests/t536-operator-f.f90
/libCEED/tests/t540-operator-f.f90
/libCEED/tests/tap.sh
20aaa365 08-May-2020 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

XSMM - fix q=1 index computation (#531)

* XSMM - use hash table for kernel index lookup, add khash to libCEED

* Hash - add CeedHashIJKLM to simplify xsmm tensor hash table code

* simplify use of hash table, use kernels as values

* XSMM: more idiomatic use of khash

* make style

Co-authored-by: Jed Brown <jed@jedbrown.org>

a8c028e3 07-May-2020 Natalie Beams <246972+nbeams@users.noreply.github.com>

CEED_STRIDES_BACKEND optimization for cuda-ref operator apply (#528)

* add check for backend stride status for input vectors

* add backend strides check for output vectors

* replace output copy with elem restriction for none emode

* move input skip_restrict check to setup and never allocate E-vec if not needed

* add boolean variable for E/Q vector layout for further optimization of output and add wrapper function in magma backend to create a cuda-ref operator and change this state variable

* Add missing CeedChks

* style changes to better match cuda backends

* missed style change for evec check

* add CeedChk from PR #525 (merge conflict)

* make style changes

* adjust size of nqpts for non-tensor basis

05ddf119 05-May-2020 Jeremy L Thompson <thompson.jeremy.luke@gmail.com>

MAGMA - support empty restrictions

274b8d22 05-May-2020 Jeremy L Thompson <thompson.jeremy.luke@gmail.com>

CUDA - support empty restrictions
