History log of /libCEED/rust/libceed-sys/c-src/backends/magma/ceed-magma.c (Results 51 – 75 of 115)
Revision Date Author Comments
# 5f67fade 26-Jun-2020 Jeremy L Thompson <thompson.jeremy.luke@gmail.com>

spellcheck


# d7a256fb 25-Jun-2020 Jeremy L Thompson <thompson.jeremy.luke@gmail.com>

README - highlight reproducibility of backends


# 49fd234c 12-Jun-2020 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Convert CUDA ref/reg/shared E-Layout (#554)

* tests - update tests for multiple e-layouts

* CUDA - convert ref and reg restrictions to Q-layout

* CUDA - ref/reg/shared use gen/magma E-Layout for multi elememnt basis apply and operator apply

* CUDA/MAGMA - drop eandqdiffer and separate MAGMA operator code

* CUDA - update operator comment

* reg - clarify read/write dofs/quads

* CUDA - drop dead code


# e0582403 15-May-2020 abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com>

Icl/magma queue (#524)

Update the MAGMA backend:

* add new specialized tensor basis kernels

* add batched DGEMM wrapper for non-tensor basis kernels

* switch backend to use MAGMA v2 interface

Co-authored-by: nbeams <246972+nbeams@users.noreply.github.com>
Co-authored-by: Stan Tomov <tomov@eecs.utk.edu>


# 65275b31 13-May-2020 valeriabarra <valeriabarra21@gmail.com>

Merge branch 'master' into valeria/NSfixes


# a8c028e3 07-May-2020 Natalie Beams <246972+nbeams@users.noreply.github.com>

CEED_STRIDES_BACKEND optimization for cuda-ref operator apply (#528)

* add check for backend stride status for input vectors

* add backend strides check for output vectors

* replace output copy with elem restriction for none emode

* move input skip_restrict check to setup and never allocate E-vec if not needed

* add boolean variable for E/Q vector layout for
further optimization of output and add wrapper function in magma backend
to create a cuda-ref operator and change this state variable

* Add missing CeedChks

* style changes to better match cuda backends

* missed style change for evec check

* add CeedChk from PR #525 (merge conflict)

* make style changes

* adjust size of nqpts for non-tensor basis


# 868539c2 04-Feb-2020 Natalie Beams <246972+nbeams@users.noreply.github.com>

Enable MAGMA non-tensor basis (#424)

* update magma backend files from magma-dev to new branch

* add skeleton for elem restrictions

* start putting code and files for nontensor case

* more framework for adding magma elem restrictions

* typo/old code error fixes for interface and header file, etc

* add the writedofs kernels

* fix nonconstants - template them for now

* fix bugs in non-tensor basis apply (interp and weight)

* update magma backend files from magma-dev to new branch

* add skeleton for elem restrictions

* start putting code and files for nontensor case

* more framework for adding magma elem restrictions

* typo/old code error fixes for interface and header file, etc

* add the writedofs kernels

* fix nonconstants - template them for now

* fix bugs in non-tensor basis apply (interp and weight)

* fix incorrect merge conflict resolution of header file

* fix bugs in lmode=notranspose elem restrictions
and in copying of indices to device

* test simpler kernels for lmode=transpose elem restrict

* swap element and component ordering in tensor basis actions

* update comments in restriction kernels to match swapped ordering

* fix if statement to work with CEED_VECTOR_NONE instead of NULL

* minor code cleanup

* skip t204-7 for magma after changing E-vector layout

* remove commented old calls for now

* use magma set/get vectors instead of cuda memcopies

* remove dead code

* make style changes

Co-authored-by: Stan Tomov <tomov@eecs.utk.edu>
Co-authored-by: abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com>


# 7f5b9731 02-Oct-2019 Stan Tomov <tomov@eecs.utk.edu>

Magma dev optimizations (#111)

* makefile changes

* update magma backend

* magma qfunctions updated to new interface

* in the magmabackend we manage where pointers are - if on CPU, on some cases we still need and may call the CPU code

* update the reflect changes in the API

* update the reflect changes in the API

* add the q functions for ex1.c

* Switch to CeedIntPow

* Fix merge errors

* Clean up Magma operator loops

* Move zeroing lvec

* fix bug in the rebase and add some qfunctions. This passes the tests now

* adding new files, changing -O to -O3

* new faster way of checking CPU vs. GPU pointers

* core magma device functions for basis apply

* new kernels for basis apply

* use the new magma_isdevptr function

* minor cleanup

* new headers and defs

* calling the new magma functions for basis apply

* undo O3, and change default magma directory

* use static

* use static

* modify the generator to add before __global__

* remove unnecessary header

* silence some warnings

* Makefile: restore NVCC and NVCCFLAGS to match master

* first pass as updating new Magma work, untested

* Use CUDA backend to dispatch

* Device memory for MAGMA

* Add copyright messages and tidy

* WIP: starting fresh on magma-dev-rebae. Add magma_is_devptr

* WIP: starting fresh on magma-dev-rebae. Fix build issue

* WIP: starting fresh on magma-dev-rebae. Fix build issue

* WIP: starting fresh on magma-dev-rebae. Fix build issue

* WIP: starting fresh on magma-dev-rebae. Fix this include

* WIP: starting fresh on magma-dev-rebae. Fix build issue.

* WIP: starting fresh on magma-dev-rebase. Mostly fixing compilation errors

* WIP: starting fresh on magma-dev-rebase. Disbale magma-basis for now

* WIP: starting fresh on magma-dev-rebase. Edit the required magma src files

* move CeedVector_Magma functions from magma-dev branch

* add includes

* fix build errors

* disable magma vector logic for now

* Remove reference in CeedDelegate

* add RestoreArray calls to CeedBasisApply_Magma

* add basis for magma

* magma batched operator

* Change batch calls to match Q-vector ordering

* minor cleanup of unused variable

* update magma portion of Makefile

* remove magma vectors

* remove unused contract variable

* change extern to CEED_INTERN

* merge in updates from master branch

* MAGMA - add lcov markers

* remove stray example script


# 288c0443 13-Sep-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

QFunction Create by Name (#311)

This PR adds a QFunction gallery to libCEED with 1D, 2D, and 3D mass and Poisson operators.

Closes issue #37, issue #340

* Add QFunction gallery, rename focca

* Gallery - add initial QFunctions

* Add a test for using the QF gallery

* Modify ex1 to use gallery

* Add multiple test configs to tap

* Move output to test directory

* Update junit

* Add OCCA galley exception

* Add ex2

* Update ex2 for dim->ncompx

* Gallery - modify to work for CUDA as is

* Update Documentation

* Gallery - typo fix

* Gallery - convention change, postappend qfunction family variant

* Gallery - update template with new name checking convention

* Gallery - condense diff3DBuild QFunction

* Gallery - rename diff -> poisson

* Gallery - clarify poisson3DBuild comment

* Gallery - use Pragma SIMD, store Qdata in Voigt convention

* Examples - Convert BP3-6 to Voigt convention

* Examples - add cl option to switch between header and gallery qfs in CEED examples

* Examples - clean up construction of QF name

* Gallery - Switch to PascalCase for gallery names

* Doc - fix function type page

* Interface - Make sure strncpy result is null terminated

* Gallery - Update Poisson 2/3D Apply to new QF body

* make style

* make style - fix worst style problems

* make style - add gallery to make style

* Doc - update documentation errors and inconsistencies

* Examples - test ex1 ex2 with and without gallary

* Examples - reduce testing of ex1/ex2 without gallery, clean up non-gallery qfunctions

* MFEM - revert another make style mistake

* Manual make style updates

* Doc - update function documentation page

* Style updates, document test numbering conventions

* doc: resolve ambiguous image location warning, allow more Dot nodes

* Tests - style and cast cleanup

* Tests - fix README indentation


# a7724da3 24-May-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Merge pull request #232 from CEED/offsetof-fix

Offsetof fix


# f8902d9e 24-May-2019 jeremylt <jeremy.thompson@colorado.edu>

VecCreate -> VectorCreate


# d4fd2798 18-May-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Merge pull request #228 from CEED/rstr-block

Restriction Block


# be9261b7 28-Mar-2019 jeremylt <jeremy.thompson@colorado.edu>

Add ElemRestrictionApplyBlock


# f05116b9 14-Mar-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Drop array argument from backend RestoreArray (#210)


# 52d6035f 13-Mar-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Operator Composition (#197)

* Composite Operator for cpu/self family of backends

* Remove small leak

* Improve C tests

* Add composite operator to Fortran interface and tests

* Fix Fortran test missing destroys

* Fortran test okl files, currently not used

* fix error in composite ' add' flag logic

* Switch composite op tests to f90

* Check for operator type on utility functions

* Documentation and test cleanup

* Make Style


# 9f0427d9 12-Jan-2019 Yohann <yohann.dudouit@gmail.com>

Cuda backend (#175)

Thanks-to: Steven Roberts
- for achieving most of the initial work, the code was well designed, clean, and pleasantly written.
Thanks-to: Jeremy Thompson
- for his constant support, exceptional patience, and the numerous relevant suggestions.

* Start cuda branch

* Start cuda branch

* Cuda backend works correctly for example 1

* More reliable operator destroy

* Fix cuda registration

* Makefile now works for cuda backend

* Start qfunction parallelization

* Remove extra cuda flags

* Cuda backend uses vector api instead of directly accessing internals

* Fix header from find and replace mistake

* Cuda qfunction callback working properly

* Cuda uses same integer pow function as other backends

* Use nvcc if available to support Cuda backend

* Remove extra memcpys from getting and restoring arrays

* MFEM examples work for cuda backend

* Optimized basis kernels to better utilize shared memory

* More kernel optimization

* Active/passive updates

* Make cuda kernels static to minimize external functions

* Fix cuda qfunction kernel loop condition

* Switch to NVRTC for cuda backend

* Add nelem argument to cuda basis apply

* First commit for the libParanumal backend

* Adds a function skeleton for the ceed-libparanumal-opearator.c

* Adds OperatorDestroy and OperatorSetupFields to the libParanumal backend.

* Adds some guidelines for the implementation of the backend.

* Partially implement OperatorSetup for libparanumal.

- The core of the OperatorSetup is written
- Adds a spec field to CeedQFunction_private

* Adds the CeedQFunctionCreateInteriorFromGallery.

- The gallery only contains a skeleton for "elliptic" for the moment.
- Comment some code unecessary for the moment.

* Change the default fields for elliptic.

* Add setters, remove impl header from CPU, OCCA backends

* Add global NUM_BACKEND, fix qf user pointer getter

* Improve operator field frees

* Update MAGMA backend

* Use Occa Vectors in the libParanumal backend.

* Typo Fix

* Vector inputs for BasisApply and QFApply; CPU backends, OCCA, and tests converted

* Implements the new version of CeedQFunctionApply_Cuda.

* Update the Cuda backend to PR174.

* Bug fix in Cuda backend.

- Replace sprintf by snprintf
- More careful use of the macro 'va_arg'

* Vector inputs for BasisApply and QFApply; CPU backends, OCCA, and tests converted

* Update MAGMA backend to vector inputs

* Modify restriction create in the cuda backend to handle memory correctly.

* Modify restriction destroy and apply of the cuda backend.

* Corrects a few typos in the cuda backend.

* Replace a CeedFree by a cudaFree...

* CeedVectorRestoreArrayRead was syncing unnecessarly data.

* CeedVectorRestoreArrayRead was syncing unnecessarly data.

* [FIX] Adds CeedVectorRestoreArray in the restriction of the cuda backend.

* Adds an error check.

* Handles indice==NULL for identity restriction.

* Adds an CeedElemRestrictionCreateBlocked_Cuda that errors.

* Adds VectorRestor in BasisApply.

* Attempt to make SetValue function.

* Adds the memState variable inside the CeedVectorCuda and uses it.

* Fix a bug that was passing the pointer instead of the address of
the pointer to CeedFree......

* Some cleaning.

* Fix a logic error in VectorGetArray.

- Now allocates an array whatever the memState is

* Fix: Basis apply checks if emode!=CEED_EVAL_WEIGHT before getting u array.

* Cleaning for PR to libCEED repo.

* Uses Setters instead of direct struct access.

* Use Getters instead of direct structure access.

* minor forgot to get ierr in after calling some functions.

* Forget to add the SetValue function in Cuda Vector...

* minor: Works even better if we give the right function to SetValue

* Fix: Set the right function for RestrictionBlocked...

* Replace some CeedChk with CeedChk_Cu

* Fix: Replace 'vec' by its length 'length'.

* Adds some CeedChk.

* Fix the Cuda_context_destroyed bug

* Adds error checking to cudaMemcpyH2D but not to D2H since it errors...

* Use Occa file approach to read Cuda QFunctions.

* Fix a few bugs

* Test a new approach to pass the qFunction fields.

* Remove typo in t400.cu and remove debugging printf.

* Append the Cuda Fields struct at the beginning of each qFunction .cu file.

* Add qFunctions for t500, t501 and t502.

* Correct cu functions for t502.

* Memcpy the ctx on the device at each Apply call.

* Checks errors in VectorSync.

* Modifies a bit the memState logic.

* Adds a Cuda implementation of Operator instead of using Ref.

* Remove some unnecessary GetArray in OperatorApply.

* Does a trick for CEED_EVAL_NONE output.

* Fix a bug in CEED_EVAL_WEIGHT.

* Applies the QFunction to all elements, not only the first one...

* A debugging commit.

* Fix: CEED_EVAL_WEIGHT use nelem in BasisApply_Cuda.

* Rewritten weight kernel.

* All C tests pass.

* Cleaning for PR.

* Remove unneeded commented code.

* Remove commented code.

* Remove the check on the pointer in RestoreArray.

* Fix a CeedFree bug.

* Fix the edata memory leak.

* Fix misuse of CeedFree.

* Allocate device memory if there is a magic context appearing due to Fortran.

* make style

* Adds cu files for petsc/bp1 mfem/bp1 and ceed/ex1.

* Remove a warning.

* Remove switch case fall-thourgh to remove warnings.

* Remive some bugs, make other bugs show up.

* Implement the Identity Restriction.

* Size correctly the restriction.

* Modify GPU restriction kernels instead of making dummy identity.

* Add cudaFree(0) before compiling to initialize the context (?!)

* Rewritten weight kernel.

* Fix typo in weight kernel.

* Fix typo in weight kernel.

* Add bp1.cu and bp3.cu for the petsc examples.

* Rewritten interp kernel for Cuda backend.

The interp kernel was not writting data in the layout that the
QFunction is expecting.

* Rewritten grad kernel for Cuda backend.

- Small fix on the interp kernel.
- The grad kernel was not writting data in the layout that the
QFunction is expecting.

* Fix the logic in interp kernel.

* Fix the shared memory size.

* Modify grad kernel to take into account the libCEED data layout.

* Add a cuda file for mfem/bp3.

* Add synchronisation to mfem bp1 and bp3.

* Fix the grad and weight kernel to have the correct data layout.

* Forgotten cu files for Fortran.

* Corrects some typos in the Cuda file for petsc/bp1.

* Add Cuda files for the new t401 test.

* Update the logic on the transfer of the qFunction ctx.

* Write petsc/bp1 in C++ instead of C.

* Minor fix: typo

* Add synchronization to petsc/bp1+bp3.

* Removes the sync on rho in petsc/bp1+bp3.

* Integrate Jeremy Thompson's remarks to the PR.

* Use CeedError instead of exit(1).

* Removes -lstdc++ and adds Ceed in front of DeviceSetValue function.

* Removes synchronization on 'u' in the Apply.

* minor

* make style

* Use the new context interface.

* Minor

* Minor.

* Minor.

* Make style using align-pointer=name

* Minor: some cleaning

* CeedQFunctionUser: write documentation

* Make NVCC compatible with new OPT compiler options


# dc1dbf07 19-Dec-2018 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Merge pull request #174 from CEED/vec-inputs

Vec inputs


# 16383ffc 25-Nov-2018 jeremylt <jeremy.thompson@colorado.edu>

Update MAGMA backend to vector inputs


# 5c32accb 18-Dec-2018 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Merge pull request #172 from CEED/setters

Setters


# 5a51ea82 16-Nov-2018 jeremylt <jeremy.thompson@colorado.edu>

Update MAGMA backend


# 98326946 04-Dec-2018 Jed Brown <jed@jedbrown.org>

Merge pull request #170 from CEED/op-transpose-rstr

Add LMode field to CeedOperatorSetField


# 4dccadb6 30-Oct-2018 jeremylt <jeremy.thompson@colorado.edu>

Add lmode field to CeedOperatorSetField


# 0a5a520a 06-Nov-2018 Jed Brown <jed@jedbrown.org>

Merge branch 'getters' of github:ceed/libceed [PR #167]

* 'getters' of github:ceed/libceed:
Improved documentation
Add Operator/QFunction field getters
Update documentation
Separate to 3 header files
First round of getters

[Remove unnecessary ceed-impl.h in merge.]


# d1bcdac9 23-Oct-2018 jeremylt <jeremy.thompson@colorado.edu>

Add Operator/QFunction field getters


# 9e1c8ed3 12-Sep-2018 Jed Brown <jed@jedbrown.org>

Merge branch 'remove-extra-ceed-args' [PR #148]

* remove-extra-ceed-args:
Restore ceed argument in TensorContractRef/Opt
Switch held ref ceed to delegate ceed recursively checked for
Refactor to standardize backend create functions

