History log of /libCEED/README.md (Results 101 – 125 of 158)
Revision Date Author Comments
# 821dffb6 24-Mar-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

README typo fix


# 55ae60f9 14-Mar-2019 Yohann <yohann.dudouit@gmail.com>

Simple Cuda backend using one thread per element (#195)

Thanks-to: Jeremy Thompson

* Take into account the compute capability of the GPU

* Add the cuda/reg backend and rename cuda to cuda/ref.

- cuda/reg uses a simple approach where each element is
processed by one thread. This approach is expected to be
efficient for 1D and 2D problems, but very inefficient
as soon as the kernels start to spill registers, which should
occur around Q1D=4 for 3D problems.

* Compilation takes into account the deviceId

* Make style

* Remove dead code in cuda qFunctions.

* Cuda-reg specialized Restriction.

* Split the Prolongation operator into Identity/not Identity.

* Remove "#pragma unroll" until further perf investigation.

* README update

* Add a description of cuda/reg.

* Add CompositeOperator msg to CUDA backends


# 84a01de5 12-Mar-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Serial and Blocked AVX Backends (#198)

* Add serial AVX backend

* Style and README changes

* Simplify AVX serial tensor loop

* Minor performance improvement

* C=1 AVX scalar case

* Increase use of AVX commands for edge cases

* Prep for eventual Tensor Object

* Comment updates

* Readme update

* Update README

* Refactor to reduce code

* Increase vectorization in remainder of columns

* Vectorize column remainder on C=1 case

* Switch to static inlining for AVX tensor contract

* Tidying for merge

* make style

* Style cleanup

* Full register use for columns

* Make style


# 0a1d75a0 06-Feb-2019 Valeria Barra <39932030+valeriabarra@users.noreply.github.com>

Merge pull request #206 from CEED/wording

Readability changes


# 6b75b9c5 06-Feb-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

spelling


# 293f4b1a 06-Feb-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Update README.md


# 29d6e734 06-Feb-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Readme update


# 4d1cd9fc 06-Feb-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Add Nek to Travis (#169)

* Add test mode to Nek BP1 and BP3, improve Nek BPs

* Fix OCCA identity rst for multifield, minor NekBP1 fix

* Improve Nek run script

* Add Nek5K to prove-all

* Update travis yml for Nek5K

* Make style

* Adjust Travis yml

* Combine Nek run bash scripts

* Minor Nek script improvements

* Update to Nek 18.0 and reduce number of Nek compiler warnings

* Document required Nek5k version

* Remove stray command

* Remove extra file

* Adapt Nek for CUDA backend

* Fix Nek script string comparison

* Modify Nek script for better exit codes

* typo fix

* Modify the CU function names in nek/bp1.cu and nek/bp3.cu

* .cu file consistency

* Tidy Travis

* Tidy Travis

* Operator fixes


# 0f918338 30-Jan-2019 Valeria Barra <39932030+valeriabarra@users.noreply.github.com>

Merge pull request #202 from CEED/XSMM-fix

Fix LIBXSMM capitalization


# a0ecefdd 30-Jan-2019 jeremylt <jeremy.thompson@colorado.edu>

Fix LIBXSMM capitalization


# 2f4d9adb 26-Jan-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Benchmarking (#187)

* Add make benchmarks

* Various tweaks related to the benchmarks.

* In Makefile:
  * target 'all' now builds the library, all tests and examples
  * the old 'all' target is now called 'par'
  * the target 'examples' will also build the MFEM and PETSc examples if
    the respective library is available.

  In the benchmarks/ directory:
  * remove 'config.sh'
  * clean up unused stuff from 'benchmark.sh'.

* Fix postprocess scripts, convert to Python 3

* Small update in README.md

* Set benchmark cg its max, update gitignore

* Minor makefile fix

* In Makefile, add 'par' to the list of phony targets.

* In benchmarks/postprocess-table.py, sort the table by backend first.

* Small update in examples/petsc/Makefile - add a comment that
PETSC_ARCH can be undefined/empty, e.g. when using PETSc installed
through Spack.

* In Makefile, update the benchmarking targets:
  * add separate targets for individual tests: `bench-petsc-bp1`,
    `bench-petsc-bp3`, etc.
  * `make benchmarks` runs all defined benchmarks.

Update README.md to reflect the above changes.


# f6a4878d 23-Jan-2019 Jed Brown <jed@jedbrown.org>

Merge pull request #186 from CEED/libxsmm

Initial libXSMM Backend


# 8d713cf6 20-Dec-2018 jeremylt <jeremy.thompson@colorado.edu>

Initial libXSMM backend


# ae228676 11-Jan-2019 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Merge pull request #182 from CEED/avx

AVX Backend


# 48fffa06 17-Dec-2018 jeremylt <jeremy.thompson@colorado.edu>

avx vectorized backend

Edge cases for AVX BasisApply

Priority adjustment to match libXSMM branch

Remove scalar/simd mix for Intel

Check for CC AVX support

AVX: proposed doc and makefile detection update


# dba52a49 04-Sep-2018 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Merge pull request #147 from CEED/opt-to-vec

Rename /cpu/self/opt to /cpu/self/blocked


# 4a2e7687 04-Sep-2018 jeremylt <jeremy.thompson@colorado.edu>

Rename /cpu/self/opt to /cpu/self/blocked


# f82d2baa 15-Aug-2018 Jed Brown <jed@jedbrown.org>

Merge branch 'cleanup' [PR #118]

* cleanup:
make style, excluding backends/{occa,magma}
make style: fix interface and include paths
docs: fix capitalization
doc: add developer notes on shape and adopt convention
Standardize CeedIntPow and CeedIntMin
Move and document CeedIntMin, document CeedPowInt
Add function levels
Update Doxygen output naming
Add Test List to Doxygen
Doxygen interface comment updates
Remove redundant doxygen comments
Documentation updating for t500
Move ceed* files to 'interface' directory, comment cleanup
Further CPU backend commenting and cleaning
Reorder tests, renumber for future expansion
Clean up and tighten Opt and Ref backends


# dfdf5a53 12-Aug-2018 jeremylt <jeremy.thompson@colorado.edu>

Add function levels


# 9ddbf157 09-Aug-2018 jeremylt <jeremy.thompson@colorado.edu>

Documentation updating for t500


# 583a6f96 07-Aug-2018 Jed Brown <jed@jedbrown.org>

Add coverage badge


# b1b1662c 01-Aug-2018 Jed Brown <jed@jedbrown.org>

Merge branch 'jed/makefile-optflags' [PR #102]

* jed/makefile-optflags:
Makefile: use LINK.* for clearer output/less duplication
Makefile: add OPT for all-language opt/dbg flags


# 323c739c 24-Jul-2018 Jed Brown <jed@jedbrown.org>

Makefile: add OPT for all-language opt/dbg flags


# 389b3d93 19-Jul-2018 Jed Brown <jed@jedbrown.org>

Merge branch 'jed/active-passive' [PR #41]

* jed/active-passive: (58 commits)
Remove spurious comments
Make style
[PETSc] Modify Makefile for abspath for .okl
[OCCA] PETSc bp1 works, but .okl error in prove-all
[OCCA] Fix qfunction not shifting output pointers
[OCCA] Replacing series of 'if's with switch
Modify Makefile to include ceed.pc for prove-all
Fix error in Makefile checking for MFEM_DIR
Update README.md
Update Tmpl to use highest priority /cpu/self
[OCCA] Rework switch statement for AllocOpOut and AllocOpIn
PETSc bp1: update okl kernels and extract ComputeErrorMax
Add CeedVectorGetLength
Occa: sync to host for passive fields
PETSc bp1: compute collocated error vector instead of reducing in kernel
Occa: copy OperatorApply output to "used" pointer
Add check for MFEM_DIR to Makefile
[OCCA]Add zeroing of outvecs
Further work on Nek5000 BPs, added error checking to OpApply
[NEK][WIP] Modifying BPs
...


# 97a942b6 10-Jul-2018 Jeremy L Thompson <25011573+jeremylt@users.noreply.github.com>

Update README.md

