| 288c0443 | 13-Sep-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
QFunction Create by Name (#311)
This PR adds a QFunction gallery to libCEED with 1D, 2D, and 3D mass and Poisson operators.
Closes issue #37, issue #340
* Add QFunction gallery, rename focca
* Gallery - add initial QFunctions
* Add a test for using the QF gallery
* Modify ex1 to use gallery
* Add multiple test configs to tap
* Move output to test directory
* Update junit
* Add OCCA gallery exception
* Add ex2
* Update ex2 for dim->ncompx
* Gallery - modify to work for CUDA as is
* Update Documentation
* Gallery - typo fix
* Gallery - convention change, postappend qfunction family variant
* Gallery - update template with new name checking convention
* Gallery - condense diff3DBuild QFunction
* Gallery - rename diff -> poisson
* Gallery - clarify poisson3DBuild comment
* Gallery - use Pragma SIMD, store Qdata in Voigt convention
* Examples - Convert BP3-6 to Voigt convention
* Examples - add cl option to switch between header and gallery qfs in CEED examples
* Examples - clean up construction of QF name
* Gallery - Switch to PascalCase for gallery names
* Doc - fix function type page
* Interface - Make sure strncpy result is null terminated
* Gallery - Update Poisson 2/3D Apply to new QF body
* make style
* make style - fix worst style problems
* make style - add gallery to make style
* Doc - update documentation errors and inconsistencies
* Examples - test ex1 ex2 with and without gallery
* Examples - reduce testing of ex1/ex2 without gallery, clean up non-gallery qfunctions
* MFEM - revert another make style mistake
* Manual make style updates
* Doc - update function documentation page
* Style updates, document test numbering conventions
* doc: resolve ambiguous image location warning, allow more Dot nodes
* Tests - style and cast cleanup
* Tests - fix README indentation
|
| ee07ded2 | 11-Sep-2019 |
Valeria Barra <39932030+valeriabarra@users.noreply.github.com> |
Add CeedPragmaOMP to bps (#338)
* Convert petsc BP3&4 to loops
* Update petsc/bp4.h looping
* Switch to CeedPragmaSIMD and make examples/petsc/bp3.h consistent with bp4.h
Remove CeedPragmaOMP directive in Nek example and update documentation
* Remove restrict qualifier in petsc/bp3.h and update documentation
|
| 4d537eea | 02-Sep-2019 |
Yohann <dudouit1@llnl.gov> |
Single Source QFunction (#304)
Introduce a new macro CEED_QFUNCTION that allows defining qFunctions in a single source file, independently of the targeted backend.
Thanks-to: Jeremy Thompson
Thanks-to: Jed Brown
This work is the result of a fruitful discussion between Jed Brown, Jeremy Thompson and Yohann Dudouit. Jeremy Thompson also implemented important features in this commit and was very active and helpful all along the progress of this work.
[NEWS] Breaking change: QFunctionField parameter 'ncomp' changed to 'size'. This change requires setting the previous value of 'ncomp' to 'ncomp*dim' when adding a QFunctionField with eval mode 'CEED_EVAL_GRAD'.
* First steps toward cuda-gen backend!
* Closer to real code generation.
* Generated code should be ready for nvrtc.
* The code generation skeleton is ready.
* Hack with the qfunction to make the operator kernel compile.
* Some tweaks in the makefile + Input fields structure change.
* Remove using cout.
* 1d interp and grad device functions.
* 1d readDofs, readQuads, writeDofs, writeQuads.
* Remove dead code.
* readDofs, readQuads, writeDofs, writeQuads for 2d and 3d
* 2d interp and grad
* 3d interp and grad
* - weight functions for 1d,2d,3d
- link the indices to the kernel
- link the fields to the kernel
- link the basis to the kernel
* Add the qFunction reader + inlining
* Add qf files for the tests.
* Add qf file for ceed/ex1
* Add qf file for mfem/bp1
* All tests pass.
* Add qFunction for mfem/bp3, petsc/bp1, and petsc/bp3.
* mfem/bp1 passes + remove dead code
* Fix a bug in n_quads_out for writeQuads
* mfem/bp3 passes.
* All tests all examples pass.
* Temporary tweaks for mfem benchmarking
* Add Context management.
* Modify .qf files to take into account the context.
* Enable optimizations.
* First set of optimization for 2D and 3D.
* double pointer format for the qFunction.
* Change the .qf files to have the same code as the C functions.
* Make previous Cuda backends use .qf files.
* Add a return value to qFunctions.
* Make cpu backends use .qf files.
* Minor: clean commented code.
* Add guarded math.h for petsc examples.
* Remove previous nek qf files.
* Remove .cu files.
* Remove .qf files.
* Remove dead code in the tests.
* make style
* Make style fix.
* more make style fixes.
* CEED_QFUNCTION - improve macro for CPU filenames
* Add CEED_QFUNCTION macro to navierstokes.c
* Fix PETSc gitignore
* Change default NS problemtype to density_current (#307) in navierstokes.c
* Fix petsc bp1.h
* Real Fix for petsc bp1.h...
* fix
* README - Add /gpu/cuda/gen
* PETSc - Update dmplex example to use *_loc
* cuda/reg - fix typo
* Revert a couple of small changes
* Fix a bug in mfem bp3 similar to the previous bug in petsc bp3.
* Make PETSc qfunctions look closer to master, and minor style for debugging.
* More uniformity changes
* Fix a strange CUDA_OUT_OF_RESSOURCE bug.
* NS - fix fname variables
* Use a different convention for qFunction ncomp.
* update cuda-gen backend and bpsdmplex.
* PETSc - style update
* update mfem bp1 and bp3.
* Interface - Use size instead of ncomp for QFunction fields
* update ceed example and tests.
* Tests - Update ncomp to size
* CPU Backends - Update ncomp to size
* CPU Backends - style
* Nek - Update ncomp to size
* Opt - fix style
* CUDA - update ncomp to size
* Doc - Update API documentation for QFunction \ncomp->size
* OCCA - Patch QFunction ncomp -> size, work but revamp will be better
* OCCA - assert dim>0 for clang-tidy
* CUDA - Change GetNumComp to GetSize
* Basis - Shift check for dim > 0 to interface
* Doc update
* Update NS field size
* NS - Fix problem options
|
| f90c8643 | 22-Aug-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Make style |
| 8795c945 | 22-Aug-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Rename NDoF to NNodes and style updates |
| 8c91a0c9 | 27-Aug-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Documentation Typo Fixes (#321)
* Basis - fix typo in doc
* Ceed - fix typo in doc
* Operator - fix typo in doc
* QFunction - fix typo in doc
* Ceed.h - fix typos in doc
* Ref - fix typos in doc
* Blocked - fix typos in doc
* READMEs - typo fixes
|
| 819eb1b3 | 30-Jul-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
PETSc DMPlex BP1-6 working in unified code |
| f02ca4a2 | 12-Jul-2019 |
Jed Brown <jed@jedbrown.org> |
Refine PETSc DMPlex BPs
This currently only views shape, but could some day learn to view values. Adding to public interface because it's a useful tool for debugging user code. |
| 241a4b83 | 25-Jul-2019 |
Yohann <yohann.dudouit@gmail.com> |
Full jit compiled operator: cuda-gen backend (#275)
* First steps toward cuda-gen backend!
* Closer to real code generation.
* Generated code should be ready for nvrtc.
* The code generation skeleton is ready.
* Hack with the qfunction to make the operator kernel compile.
* Some tweaks in the makefile + Input fields structure change.
* Remove using cout.
* 1d interp and grad device functions.
* 1d readDofs, readQuads, writeDofs, writeQuads.
* Remove dead code.
* readDofs, readQuads, writeDofs, writeQuads for 2d and 3d
* 2d interp and grad
* 3d interp and grad
* - weight functions for 1d,2d,3d
- link the indices to the kernel
- link the fields to the kernel
- link the basis to the kernel
* Add the qFunction reader + inlining
* Add qf files for the tests.
* Add qf file for ceed/ex1
* Add qf file for mfem/bp1
* All tests pass.
* Add qFunction for mfem/bp3, petsc/bp1, and petsc/bp3.
* mfem/bp1 passes + remove dead code
* Fix a bug in n_quads_out for writeQuads
* mfem/bp3 passes.
* All tests all examples pass.
* Temporary tweaks for mfem benchmarking
* Add Context management.
* Modify .qf files to take into account the context.
* Enable optimizations.
* First set of optimization for 2D and 3D.
* Makefile tweaks and destructor code.
* make style.
* Add -MP flag.
* Fix linking issues with the tests.
* Update .qf files for the tests.
* Add .qf files for nek5000 examples.
* Use shared memory for B and G matrices.
* Fix bug introduced in previous commit.
|
| 1469ee4d | 10-Jul-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Add ElemRestrGetMult and tests |
| a7bd39da | 10-Jun-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Fix underinterpolation mode for /cpu/self backends |
| a4999edd | 24-May-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Update Ceed Delegate referencing |
| aefd8378 | 29-Apr-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Add delegates for specific objects |
| 683faae0 | 26-May-2019 |
Jed Brown <jed@jedbrown.org> |
make tidy: static analysis using clang-tidy
* Add "make tidy" and "make interface/ceed.c.tidy" targets to Makefile.
* Use CPPFLAGS instead of CFLAGS for preprocessor flags.
* For __clang__, convince compiler that CeedError always returns nonzero.
* Fix two minor issues detected by clang-tidy (missing va_end and an unnecessary or possibly-misleading guard).
Resolves issue #193
|
| f8902d9e | 24-May-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
VecCreate -> VectorCreate |
| 6e79d475 | 01-Apr-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Dynamically allocate Ceed function table |
| be9261b7 | 28-Mar-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Add ElemRestrictionApplyBlock |
| c907536f | 27-Mar-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Add CeedGetPreferredMemType |
| c71e1dcd | 20-Mar-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Add Basis argument to TensorContractCreate |
| 54540941 | 14-Mar-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Add CeedVectorSyncArray (#214) |
| f05116b9 | 14-Mar-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Drop array argument from backend RestoreArray (#210) |
| 2f86a920 | 13-Mar-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Add CeedTensorContract object (#211) |
| 52d6035f | 13-Mar-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Operator Composition (#197)
* Composite Operator for cpu/self family of backends
* Remove small leak
* Improve C tests
* Add composite operator to Fortran interface and tests
* Fix Fortran test missing destroys
* Fortran test okl files, currently not used
* fix error in composite ' add' flag logic
* Switch composite op tests to f90
* Check for operator type on utility functions
* Documentation and test cleanup
* Make Style
|
| cdf4f918 | 09-Mar-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Apply style changes |
| 9f0427d9 | 12-Jan-2019 |
Yohann <yohann.dudouit@gmail.com> |
Cuda backend (#175)
Thanks-to: Steven Roberts
- for achieving most of the initial work, the code was well designed, clean, and pleasantly written.
Thanks-to: Jeremy Thompson
- for his constant support, exceptional patience, and the numerous relevant suggestions.
* Start cuda branch
* Start cuda branch
* Cuda backend works correctly for example 1
* More reliable operator destroy
* Fix cuda registration
* Makefile now works for cuda backend
* Start qfunction parallelization
* Remove extra cuda flags
* Cuda backend uses vector api instead of directly accessing internals
* Fix header from find and replace mistake
* Cuda qfunction callback working properly
* Cuda uses same integer pow function as other backends
* Use nvcc if available to support Cuda backend
* Remove extra memcpys from getting and restoring arrays
* MFEM examples work for cuda backend
* Optimized basis kernels to better utilize shared memory
* More kernel optimization
* Active/passive updates
* Make cuda kernels static to minimize external functions
* Fix cuda qfunction kernel loop condition
* Switch to NVRTC for cuda backend
* Add nelem argument to cuda basis apply
* First commit for the libParanumal backend
* Adds a function skeleton for the ceed-libparanumal-opearator.c
* Adds OperatorDestroy and OperatorSetupFields to the libParanumal backend.
* Adds some guidelines for the implementation of the backend.
* Partially implement OperatorSetup for libparanumal.
- The core of the OperatorSetup is written
- Adds a spec field to CeedQFunction_private
* Adds the CeedQFunctionCreateInteriorFromGallery.
- The gallery only contains a skeleton for "elliptic" for the moment.
- Comment some code unnecessary for the moment.
* Change the default fields for elliptic.
* Add setters, remove impl header from CPU, OCCA backends
* Add global NUM_BACKEND, fix qf user pointer getter
* Improve operator field frees
* Update MAGMA backend
* Use Occa Vectors in the libParanumal backend.
* Typo Fix
* Vector inputs for BasisApply and QFApply; CPU backends, OCCA, and tests converted
* Implements the new version of CeedQFunctionApply_Cuda.
* Update the Cuda backend to PR174.
* Bug fix in Cuda backend.
- Replace sprintf by snprintf
- More careful use of the macro 'va_arg'
* Vector inputs for BasisApply and QFApply; CPU backends, OCCA, and tests converted
* Update MAGMA backend to vector inputs
* Modify restriction create in the cuda backend to handle memory correctly.
* Modify restriction destroy and apply of the cuda backend.
* Corrects a few typos in the cuda backend.
* Replace a CeedFree by a cudaFree...
* CeedVectorRestoreArrayRead was syncing data unnecessarily.
* CeedVectorRestoreArrayRead was syncing data unnecessarily.
* [FIX] Adds CeedVectorRestoreArray in the restriction of the cuda backend.
* Adds an error check.
* Handles indice==NULL for identity restriction.
* Adds an CeedElemRestrictionCreateBlocked_Cuda that errors.
* Adds VectorRestore in BasisApply.
* Attempt to make SetValue function.
* Adds the memState variable inside the CeedVectorCuda and uses it.
* Fix a bug that was passing the pointer instead of the address of
the pointer to CeedFree......
* Some cleaning.
* Fix a logic error in VectorGetArray.
- Now allocates an array whatever the memState is
* Fix: Basis apply checks if emode!=CEED_EVAL_WEIGHT before getting u array.
* Cleaning for PR to libCEED repo.
* Uses Setters instead of direct struct access.
* Use Getters instead of direct structure access.
* minor forgot to get ierr in after calling some functions.
* Forget to add the SetValue function in Cuda Vector...
* minor: Works even better if we give the right function to SetValue
* Fix: Set the right function for RestrictionBlocked...
* Replace some CeedChk with CeedChk_Cu
* Fix: Replace 'vec' by its length 'length'.
* Adds some CeedChk.
* Fix the Cuda_context_destroyed bug
* Adds error checking to cudaMemcpyH2D but not to D2H since it errors...
* Use Occa file approach to read Cuda QFunctions.
* Fix a few bugs
* Test a new approach to pass the qFunction fields.
* Remove typo in t400.cu and remove debugging printf.
* Append the Cuda Fields struct at the beginning of each qFunction .cu file.
* Add qFunctions for t500, t501 and t502.
* Correct cu functions for t502.
* Memcpy the ctx on the device at each Apply call.
* Checks errors in VectorSync.
* Modifies a bit the memState logic.
* Adds a Cuda implementation of Operator instead of using Ref.
* Remove some unnecessary GetArray in OperatorApply.
* Does a trick for CEED_EVAL_NONE output.
* Fix a bug in CEED_EVAL_WEIGHT.
* Applies the QFunction to all elements, not only the first one...
* A debugging commit.
* Fix: CEED_EVAL_WEIGHT use nelem in BasisApply_Cuda.
* Rewritten weight kernel.
* All C tests pass.
* Cleaning for PR.
* Remove unneeded commented code.
* Remove commented code.
* Remove the check on the pointer in RestoreArray.
* Fix a CeedFree bug.
* Fix the edata memory leak.
* Fix misuse of CeedFree.
* Allocate device memory if there is a magic context appearing due to Fortran.
* make style
* Adds cu files for petsc/bp1 mfem/bp1 and ceed/ex1.
* Remove a warning.
* Remove switch case fall-through to remove warnings.
* Remove some bugs, make other bugs show up.
* Implement the Identity Restriction.
* Size correctly the restriction.
* Modify GPU restriction kernels instead of making dummy identity.
* Add cudaFree(0) before compiling to initialize the context (?!)
* Rewritten weight kernel.
* Fix typo in weight kernel.
* Fix typo in weight kernel.
* Add bp1.cu and bp3.cu for the petsc examples.
* Rewritten interp kernel for Cuda backend.
The interp kernel was not writing data in the layout that the
QFunction is expecting.
* Rewritten grad kernel for Cuda backend.
- Small fix on the interp kernel.
- The grad kernel was not writing data in the layout that the
QFunction is expecting.
* Fix the logic in interp kernel.
* Fix the shared memory size.
* Modify grad kernel to take into account the libCEED data layout.
* Add a cuda file for mfem/bp3.
* Add synchronisation to mfem bp1 and bp3.
* Fix the grad and weight kernel to have the correct data layout.
* Forgotten cu files for Fortran.
* Corrects some typos in the Cuda file for petsc/bp1.
* Add Cuda files for the new t401 test.
* Update the logic on the transfer of the qFunction ctx.
* Write petsc/bp1 in C++ instead of C.
* Minor fix: typo
* Add synchronization to petsc/bp1+bp3.
* Removes the sync on rho in petsc/bp1+bp3.
* Integrate Jeremy Thompson's remarks to the PR.
* Use CeedError instead of exit(1).
* Removes -lstdc++ and adds Ceed in front of DeviceSetValue function.
* Removes synchronization on 'u' in the Apply.
* minor
* make style
* Use the new context interface.
* Minor
* Minor.
* Minor.
* Make style using align-pointer=name
* Minor: some cleaning
* CeedQFunctionUser: write documentation
* Make NVCC compatible with new OPT compiler options
|