| 8d75ea1b | 18-Apr-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Fix include statements |
| de686571 | 14-Mar-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Small clang-tidy fixes (#215) |
| 52d6035f | 13-Mar-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Operator Composition (#197)
* Composite Operator for cpu/self family of backends
* Remove small leak
* Improve C tests
* Add composite operator to Fortran interface and tests
* Fix Fortran test missing destroys
* Fortran test okl files, currently not used
* fix error in composite ' add' flag logic
* Switch composite op tests to f90
* Check for operator type on utility functions
* Documentation and test cleanup
* Make Style
|
| 84a01de5 | 12-Mar-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Serial and Blocked AVX Backends (#198)
* Add serial AVX backend
* Style and README changes
* Simplify AVX serial tensor loop
* Minor performance improvement
* C=1 AVX scalar case
* Increase use of AVX commands for edge cases
* Prep for eventual Tensor Object
* Comment updates
* Readme update
* Update README
* Refactor to reduce code
* Increase vectorization in remainder of columns
* Vectorize column remainder on C=1 case
* Switch to static inlining for AVX tensor contract
* Tidying for merge
* make style
* Style cleanup
* Full register use for columns
* Make style
|
| cdf4f918 | 09-Mar-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Apply style changes |
| 856142e1 | 06-Feb-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Backend naming adjustment |
| 4d1cd9fc | 06-Feb-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Add Nek to Travis (#169)
* Add test mode to Nek BP1 and BP3, improve Nek BPs
* Fix OCCA identity rst for multifield, minor NekBP1 fix
* Improve Nek run script
* Add Nek5K to prove-all
* Update travis yml for Nek5K
* Make style
* Adjust Travis yml
* Combine Nek run bash scripts
* Minor Nek script improvements
* Update to Nek 18.0 and reduce number of Nek compiler warnings
* Document required Nek5k version
* Remove stray command
* Remove extra file
* Adapt Nek for CUDA backend
* Fix Nek script string comparison
* Modify Nek script for better exit codes
* typo fix
* Modify the CU function names in nek/bp1.cu and nek/bp3.cu
* .cu file consistency
* Tidy Travis
* Tidy Travis
* Operator fixes
|
| 8d713cf6 | 20-Dec-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Initial libXSMM backend |
| 48fffa06 | 17-Dec-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
avx vectorized backend
Edge cases for AVX BasisApply
Priority adjustment to match libXSMM branch
Remove scalar/simd mix for Intel
Check for CC AVX support
AVX: proposed doc and makefile detection update
|
| 45918e5c | 12-Dec-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Make style |
| 16c359e6 | 12-Dec-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Check state of input vectors for Blocked backend |
| 91703d3f | 12-Dec-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Improve Ref/Blocked handling of operator vectors |
| aedaa0e5 | 19-Nov-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Vector inputs for BasisApply and QFApply; CPU backends, OCCA, and tests converted |
| 1dfeef1d | 12-Dec-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Make style |
| fe2413ff | 14-Nov-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Add setters, remove impl header from CPU, OCCA backends |
| 4dccadb6 | 30-Oct-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Add lmode field to CeedOperatorSetField |
| d1bcdac9 | 23-Oct-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Add Operator/QFunction field getters |
| d863ab9b | 19-Oct-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Separate to 3 header files |
| 4ce2993f | 17-Oct-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
First round of getters
Use Getters in ref backend
Add Getters to blocked backend
Convert OCCA backend to use Getters
Add getters for backend data |
| 43547ca1 | 26-Sep-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Add non-tensor bases to /cpu/self/blocked (and thus tmpl) |
| e6e9a80e | 29-Aug-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Restore ceed argument in TensorContractRef/Opt |
| 5fe0d4fa | 29-Aug-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Switch held ref ceed to delegate ceed recursively checked for |
| 667bc5fc | 29-Aug-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Refactor to standardize backend create functions |
| 4a2e7687 | 04-Sep-2018 |
jeremylt <jeremy.thompson@colorado.edu> |
Rename /cpu/self/opt to /cpu/self/blocked |