| #
56cf2fbb
|
| 10-Jul-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #291 from CEED/rstr-mult
Add ElemRestrGetMultiplicity and Tests
|
| #
1469ee4d
|
| 10-Jul-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Add ElemRestrGetMult and tests
|
| #
1226057f
|
| 27-Jun-2019 |
Yohann Dudouit <yohann.dudouit@gmail.com> |
Merge branch 'master' into yohann/cuda-restr-opt
Conflicts: backends/cuda-reg/ceed-cuda-reg-restriction.c backends/cuda-shared/ceed-cuda-shared-basis.c
|
| #
29187ef8
|
| 19-Jun-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #274 from CEED/underintegrate-basis-fix
Fix Underinterpolation in /cpu/self Backends
|
| #
a7bd39da
|
| 10-Jun-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Fix underinterpolation mode for /cpu/self backends
|
| #
d4fd2798
|
| 18-May-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #228 from CEED/rstr-block
Restriction Block
|
| #
be9261b7
|
| 28-Mar-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Add ElemRestrictionApplyBlock
|
| #
50463c24
|
| 14-May-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #177 from CEED/t106-vec
Adds a Vector unit test using CEED_MEM_DEVICE.
|
| #
c8b9fe72
|
| 30-Apr-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Add offset parameter in Fortran VecSetArray
|
| #
0f3038fc
|
| 28-Mar-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #225 from CEED/mem-type-preferred
Add CeedGetPreferredMemType
|
| #
c907536f
|
| 27-Mar-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Add CeedGetPreferredMemType
|
| #
54540941
|
| 14-Mar-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Add CeedVectorSyncArray (#214)
|
| #
52d6035f
|
| 13-Mar-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Operator Composition (#197)
* Composite Operator for cpu/self family of backends
* Remove small leak
* Improve C tests
* Add composite operator to Fortran interface and tests
* Fix Fortran test missing destroys
* Fortran test okl files, currently not used
* fix error in composite ' add' flag logic
* Switch composite op tests to f90
* Check for operator type on utility functions
* Documentation and test cleanup
* Make Style
|
| #
b99f7525
|
| 11-Mar-2019 |
Valeria Barra <39932030+valeriabarra@users.noreply.github.com> |
Merge pull request #209 from CEED/jed/astyle
Make Style Updates
|
| #
cdf4f918
|
| 09-Mar-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Apply style changes
|
| #
9f0427d9
|
| 12-Jan-2019 |
Yohann <yohann.dudouit@gmail.com> |
Cuda backend (#175)
Thanks-to: Steven Roberts
- for achieving most of the initial work, the code was well designed, clean, and pleasantly written.
Thanks-to: Jeremy Thompson
- for his constant support, exceptional patience, and the numerous relevant suggestions.
* Start cuda branch
* Start cuda branch
* Cuda backend works correctly for example 1
* More reliable operator destroy
* Fix cuda registration
* Makefile now works for cuda backend
* Start qfunction parallelization
* Remove extra cuda flags
* Cuda backend uses vector api instead of directly accessing internals
* Fix header from find and replace mistake
* Cuda qfunction callback working properly
* Cuda uses same integer pow function as other backends
* Use nvcc if available to support Cuda backend
* Remove extra memcpys from getting and restoring arrays
* MFEM examples work for cuda backend
* Optimized basis kernels to better utilize shared memory
* More kernel optimization
* Active/passive updates
* Make cuda kernels static to minimize external functions
* Fix cuda qfunction kernel loop condition
* Switch to NVRTC for cuda backend
* Add nelem argument to cuda basis apply
* First commit for the libParanumal backend
* Adds a function skeleton for the ceed-libparanumal-opearator.c
* Adds OperatorDestroy and OperatorSetupFields to the libParanumal backend.
* Adds some guidelines for the implementation of the backend.
* Partially implement OperatorSetup for libparanumal.
- The core of the OperatorSetup is written
- Adds a spec field to CeedQFunction_private
* Adds the CeedQFunctionCreateInteriorFromGallery.
- The gallery only contains a skeleton for "elliptic" for the moment.
- Comment out some code that is unnecessary for the moment.
* Change the default fields for elliptic.
* Add setters, remove impl header from CPU, OCCA backends
* Add global NUM_BACKEND, fix qf user pointer getter
* Improve operator field frees
* Update MAGMA backend
* Use Occa Vectors in the libParanumal backend.
* Typo Fix
* Vector inputs for BasisApply and QFApply; CPU backends, OCCA, and tests converted
* Implements the new version of CeedQFunctionApply_Cuda.
* Update the Cuda backend to PR174.
* Bug fix in Cuda backend.
- Replace sprintf by snprintf
- More careful use of the macro 'va_arg'
* Vector inputs for BasisApply and QFApply; CPU backends, OCCA, and tests converted
* Update MAGMA backend to vector inputs
* Modify restriction create in the cuda backend to handle memory correctly.
* Modify restriction destroy and apply of the cuda backend.
* Corrects a few typos in the cuda backend.
* Replace a CeedFree by a cudaFree...
* CeedVectorRestoreArrayRead was syncing data unnecessarily.
* CeedVectorRestoreArrayRead was syncing data unnecessarily.
* [FIX] Adds CeedVectorRestoreArray in the restriction of the cuda backend.
* Adds an error check.
* Handles indice==NULL for identity restriction.
* Adds an CeedElemRestrictionCreateBlocked_Cuda that errors.
* Adds VectorRestor in BasisApply.
* Attempt to make SetValue function.
* Adds the memState variable inside the CeedVectorCuda and uses it.
* Fix a bug that was passing the pointer instead of the address of
the pointer to CeedFree......
* Some cleaning.
* Fix a logic error in VectorGetArray.
- Now allocates an array whatever the memState is
* Fix: Basis apply checks if emode!=CEED_EVAL_WEIGHT before getting u array.
* Cleaning for PR to libCEED repo.
* Uses Setters instead of direct struct access.
* Use Getters instead of direct structure access.
* minor forgot to get ierr in after calling some functions.
* Forget to add the SetValue function in Cuda Vector...
* minor: Works even better if we give the right function to SetValue
* Fix: Set the right function for RestrictionBlocked...
* Replace some CeedChk with CeedChk_Cu
* Fix: Replace 'vec' by its length 'length'.
* Adds some CeedChk.
* Fix the Cuda_context_destroyed bug
* Adds error checking to cudaMemcpyH2D but not to D2H since it errors...
* Use Occa file approach to read Cuda QFunctions.
* Fix a few bugs
* Test a new approach to pass the qFunction fields.
* Remove typo in t400.cu and remove debugging printf.
* Append the Cuda Fields struct at the beginning of each qFunction .cu file.
* Add qFunctions for t500, t501 and t502.
* Correct cu functions for t502.
* Memcpy the ctx on the device at each Apply call.
* Checks errors in VectorSync.
* Modifies a bit the memState logic.
* Adds a Cuda implementation of Operator instead of using Ref.
* Remove some unnecessary GetArray in OperatorApply.
* Does a trick for CEED_EVAL_NONE output.
* Fix a bug in CEED_EVAL_WEIGHT.
* Applies the QFunction to all elements, not only the first one...
* A debugging commit.
* Fix: CEED_EVAL_WEIGHT use nelem in BasisApply_Cuda.
* Rewritten weight kernel.
* All C tests pass.
* Cleaning for PR.
* Remove unneeded commented code.
* Remove commented code.
* Remove the check on the pointer in RestoreArray.
* Fix a CeedFree bug.
* Fix the edata memory leak.
* Fix misuse of CeedFree.
* Allocate device memory if there is a magic context appearing due to Fortran.
* make style
* Adds cu files for petsc/bp1 mfem/bp1 and ceed/ex1.
* Remove a warning.
* Remove switch case fall-through to remove warnings.
* Remove some bugs, make other bugs show up.
* Implement the Identity Restriction.
* Size correctly the restriction.
* Modify GPU restriction kernels instead of making dummy identity.
* Add cudaFree(0) before compiling to initialize the context (?!)
* Rewritten weight kernel.
* Fix typo in weight kernel.
* Fix typo in weight kernel.
* Add bp1.cu and bp3.cu for the petsc examples.
* Rewritten interp kernel for Cuda backend.
The interp kernel was not writing data in the layout that the
QFunction is expecting.
* Rewritten grad kernel for Cuda backend.
- Small fix on the interp kernel.
- The grad kernel was not writing data in the layout that the
QFunction is expecting.
* Fix the logic in interp kernel.
* Fix the shared memory size.
* Modify grad kernel to take into account the libCEED data layout.
* Add a cuda file for mfem/bp3.
* Add synchronisation to mfem bp1 and bp3.
* Fix the grad and weight kernel to have the correct data layout.
* Forgotten cu files for Fortran.
* Corrects some typos in the Cuda file for petsc/bp1.
* Add Cuda files for the new t401 test.
* Update the logic on the transfer of the qFunction ctx.
* Write petsc/bp1 in C++ instead of C.
* Minor fix: typo
* Add synchronization to petsc/bp1+bp3.
* Removes the sync on rho in petsc/bp1+bp3.
* Integrate Jeremy Thompson's remarks to the PR.
* Use CeedError instead of exit(1).
* Removes -lstdc++ and adds Ceed in front of DeviceSetValue function.
* Removes synchronization on 'u' in the Apply.
* minor
* make style
* Use the new context interface.
* Minor
* Minor.
* Minor.
* Make style using align-pointer=name
* Minor: some cleaning
* CeedQFunctionUser: write documentation
* Make NVCC compatible with new OPT compiler options
|
| #
b3e799f9
|
| 09-Jan-2019 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #188 from CEED/fort-mem-leak
|
| #
c4b814ba
|
| 28-Dec-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Fix small mem leak in Fortran QFunctions
|
| #
1b435e3e
|
| 27-Dec-2018 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #185 from CEED/fortran-ctx
Improve QFunction ctx for Fortran interface
|
| #
1e35832b
|
| 19-Dec-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Add t401 to test QFunction with context
|
| #
418fb8c2
|
| 19-Dec-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Improve QFunction ctx for Fortran interface
|
| #
dc1dbf07
|
| 19-Dec-2018 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #174 from CEED/vec-inputs
Vec inputs
|
| #
45918e5c
|
| 12-Dec-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Make style
|
| #
aedaa0e5
|
| 19-Nov-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Vector inputs for BasisApply and QFApply; CPU backends, OCCA, and tests converted
|
| #
5c32accb
|
| 18-Dec-2018 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #172 from CEED/setters
Setters
|