| #
c532df63
|
| 16-May-2019 |
Yohann <dudouit1@llnl.gov> |
Cuda backend using shared memory (#247)
Add a GPU backend based on Cuda using shared memory.
* Draft of a shared memory backend
* New basis apply passes all tests.
* Add the possibility to treat several elements in one block of threads.
* Fix an error in 2D and 3D gradient.
* Put the cuda-shared backend in its own folder.
* Minor cleaning.
* Replace <ceed-impl.h> with <ceed-backend.h>
* make style
* Add a few CeedChk_Cu
|
| #
486febe6
|
| 15-May-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #224 from CEED/mem-zero
Check QFunction Output Vecs
|
| #
fc7cf9a0
|
| 18-Apr-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Set QFunction outputs undefined before apply in new memcheck backend
|
| #
a9a22b40
|
| 31-Mar-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #230 from CEED/makefile-info-fix
Fix the output of 'make info' for the CUDA backends.
|
| #
ab9cabde
|
| 31-Mar-2019 |
Veselin Dobrev <dobrev@llnl.gov> |
Fix the output of 'make info' for the CUDA backends.
|
| #
e17541c0
|
| 28-Mar-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #222 from CEED/blas-vs-mkl
Add MKL optional flag
|
| #
724a7164
|
| 27-Mar-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Add MKL optional flag
|
| #
2774d5cb
|
| 26-Mar-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Explicit time stepping NS solver (#152)
* Squash NS example to single commit
* Update name of NS example and explicitly zero unused outputs
* rename Theta ->theta, deltaTheta -> deltatheta and make style
* Incorporate Valeria's latest changes
* Fix small bug in Advection header
* Add Valeria's latest updates from ns-working
* Update after Jed's revision
* Improve documentation
* Drop navier-stokes from allexamples
|
| #
40938ba6
|
| 24-Mar-2019 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #218 from CEED/jed/jenkins
Add Jenkinsfile
|
| #
bdb0bdbb
|
| 22-Mar-2019 |
Jed Brown <jed@jedbrown.org> |
junit.py: update test logic for skipped and intended-fail tests
|
| #
d1f7f8d3
|
| 22-Mar-2019 |
Jed Brown <jed@jedbrown.org> |
Makefile: use $(OBJDIR) rather than hard-coding "build"
|
| #
8b6584a1
|
| 16-Mar-2019 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #216 from CEED/jed/junit-tests
Enable logging and reporting tests using JUnit format, which is understood by Azure and Jenkins. The Tests tab on Azure now reports test status.
|
| #
8ec9d54b
|
| 16-Mar-2019 |
Jed Brown <jed@jedbrown.org> |
Add junit target and report test results to Azure
|
| #
55ae60f9
|
| 14-Mar-2019 |
Yohann <yohann.dudouit@gmail.com> |
Simple Cuda backend using one thread per element (#195)
Thanks-to: Jeremy Thompson
* Take into account the compute capability of the GPU
* Add the cuda/reg backend and rename cuda to cuda/ref.
- cuda/reg uses a simple approach where each element is
processed by one thread. This approach is expected to be
efficient for 1D and 2D problems, but very inefficient
as soon as the kernels start to spill, which should arise
around Q1D=4 for 3D problems.
* Compilation takes into account the deviceId
* Make style
* Remove dead code in cuda qFunctions.
* Cuda-reg specialized Restriction.
* Split the Prolongation operator into Identity/not Identity.
* Remove "#pragma unroll" until further perf investigation.
* README update
* Add a description of cuda/reg.
* Add CompositeOperator msg to CUDA backends
|
| #
84a01de5
|
| 12-Mar-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Serial and Blocked AVX Backends (#198)
* Add serial AVX backend
* Style and README changes
* Simplify AVX serial tensor loop
* Minor performance improvement
* C=1 AVX scalar case
* Increase use of AVX commands for edge cases
* Prep for eventual Tensor Object
* Comment updates
* Readme update
* Update README
* Refactor to reduce code
* Increase vectorization in remainder of columns
* Vectorize column remainder on C=1 case
* Switch to static inlining for AVX tensor contract
* Tidying for merge
* make style
* Style cleanup
* Full register use for columns
* Make style
|
| #
b99f7525
|
| 11-Mar-2019 |
Valeria Barra <39932030+valeriabarra@users.noreply.github.com> |
Merge pull request #209 from CEED/jed/astyle
Make Style Updates
|
| #
20b73d85
|
| 11-Jan-2019 |
Jed Brown <jed@jedbrown.org> |
style: use .astylerc and filter rather than exclude rules
astyle --exclude options are evidently broken and were generating confusing output. The Makefile version also missed --align-pointer=name from .astylerc and that led to an inconsistency.
The source changes implied by this commit have not yet been applied.
|
| #
563f872d
|
| 06-Mar-2019 |
Valeria Barra <39932030+valeriabarra@users.noreply.github.com> |
Merge pull request #204 from CEED/mmd-flag-fix
Fix Stray File (Issue #200)
|
| #
5efe2378
|
| 06-Feb-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Improve AVX check to avoid stray file
|
| #
0a1d75a0
|
| 06-Feb-2019 |
Valeria Barra <39932030+valeriabarra@users.noreply.github.com> |
Merge pull request #206 from CEED/wording
Readability changes
|
| #
856142e1
|
| 06-Feb-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Backend naming adjustment
|
| #
4d1cd9fc
|
| 06-Feb-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Add Nek to Travis (#169)
* Add test mode to Nek BP1 and BP3, improve Nek BPs
* Fix OCCA identity rst for multifield, minor NekBP1 fix
* Improve Nek run script
* Add Nek5K to prove-all
* Update travis yml for Nek5K
* Make style
* Adjust Travis yml
* Combine Nek run bash scripts
* Minor Nek script improvements
* Update to Nek 18.0 and reduce number of Nek compiler warnings
* Document required Nek5k version
* Remove stray command
* Remove extra file
* Adapt Nek for CUDA backend
* Fix Nek script string comparison
* Modify Nek script for better exit codes
* typo fix
* Modify the CU function names in nek/bp1.cu and nek/bp3.cu
* .cu file consistency
* Tidy Travis
* Tidy Travis
* Operator fixes
|
| #
ad28045d
|
| 01-Feb-2019 |
Valeria Barra <39932030+valeriabarra@users.noreply.github.com> |
Merge pull request #203 from CEED/intel-f-fix
Intel Fortran Fix
|
| #
8980d4a7
|
| 01-Feb-2019 |
jeremylt <jeremy.thompson@colorado.edu> |
Switch tests to .f90 extension
|
| #
8166eebd
|
| 30-Jan-2019 |
Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com> |
Merge pull request #201 from CEED/makefile-varname-fix
In Makefile: replace '-' with '_' in a variable name.
|