| #
6ca0f394
|
| 20-Jul-2023 |
Umesh Unnikrishnan <umesh.aero@gatech.edu> |
Add sycl/gen backend and other sycl changes (#1258)
---------
Co-authored-by: Kris Rowe <kris.rowe@anl.gov>
Co-authored-by: Kris Rowe <krowe@anl.gov>
Co-authored-by: Jed Brown <jed@jedbrown.org>
Co-authored-by: Varsha Madananth <vmadananth@uan-0002.head.cm.americas.sgi.com>
Co-authored-by: James Wright <jrwrigh.iii@gmail.com>
|
| #
4548da4e
|
| 12-Jul-2023 |
Sebastian Grimberg <sebastiangrimb@gmail.com> |
Update LIBXSMM backend (#1248)
* Fix LIBXSMM kernel generation calls after 9c0e481 in https://github.com/libxsmm/libxsmm
* Update LIBXSMM interface to work with main branch after commit 1f4cdad (in preparation for v2)
* Allow user specified BLAS_LIB for LIBXSMM dependency in Makefile
* LIBXSMM does not require kernels to be released
See https://github.com/libxsmm/libxsmm/issues/783#issuecomment-1596655284.
* Improvements for non-tensor CPU-based CeedBasisApply for q_comp > 1
* Revert previous commit since it's faster to apply in P*Q panels, remove an unnecessary LIBXSMM kernel compilation
* Remove an unused macro
* make format
* Rely on LIBXSMM to cache JIT'd kernels
* LIBXSMM dispatched kernels for xsmm/serial backend
* Combine ceed-xsmm-tensor-fp64 and -fp32 into single file for all precisions
* Address PR comments
* Update GitLab CI LIBXSMM version
|
| #
7bd8c1e1
|
| 30-Jun-2023 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #1243 from CEED/jed/fix-asan-bugs
Fix -fsanitize=address bugs
|
| #
fb651866
|
| 28-Jun-2023 |
Jed Brown <jed@jedbrown.org> |
CI: upgrade actions to gcc-12 and enable address sanitizer
|
| #
bd882c8a
|
| 15-Jun-2023 |
James Wright <james@jameswright.xyz> |
Add sycl/ref and sycl/shared backends (#1229)
* Merge sycl_backend from ALCF fork
---------
Co-authored-by: Umesh Unnikrishnan <unnikrishnan@anl.gov>
Co-authored-by: Kris Rowe <krowe@anl.gov>
Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
|
| #
54624e1f
|
| 14-Jun-2023 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #1227 from CEED/jed/fix-hip-cppflags
make: pass CPPFLAGS to HIPCC
|
| #
f150ce52
|
| 13-Jun-2023 |
Jed Brown <jed@jedbrown.org> |
make: pass CPPFLAGS to HIPCC
Otherwise, a user who sets HIPCCFLAGS overrides the necessary directories that must appear in -I parameters.
|
| #
c0c8b919
|
| 01-May-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1207 from CEED/jeremy/minor-make
Fix make info alignment
|
| #
4b4ca4dc
|
| 01-May-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - fix make info alignment
|
| #
c19f45c9
|
| 25-Apr-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1204 from CEED/jeremy/make-fix
Minor Makefile consistency fix
|
| #
5120064b
|
| 21-Apr-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
make - minor consistency fix
|
| #
2c2ea1db
|
| 15-Apr-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Pedantic make option (#1193)
* make - add PEDANTIC option
* ci - use PEDANTIC make option
* cuda - fix pedantic error
* pedantic - drop nonstandard ternary for CeedError
|
| #
49aac155
|
| 24-Mar-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
IWYU fixes (#1182)
* iwyu - include fixes
* iwyu - silence some iwyu output
* minor - clearer macro names
* iwyu - fix suggestion of "ceed/ceed.h" externally
* iwyu - lighter petsc headers
* iwyu - ceed/ceed.h -> ceed.h
* iwyu - cuda/hip include fixes
|
| #
131837e7
|
| 14-Mar-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1172 from CEED/jeremy/more-tests
Spelling and Ceed example consistency
|
| #
6ed4cbd1
|
| 06-Mar-2023 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - spelling fix
|
| #
023b8a51
|
| 25-Jan-2023 |
abdelfattah83 <36712794+abdelfattah83@users.noreply.github.com> |
magma: non-tensor rtc (#1141)
* some refactoring in magma's jit src
* fix path
* fix loading src
* refactor magma nontensor backend
* refactor magma nontensor backend
* [WIP]: new nontensor basis kernels
* [WIP]: new nontensor basis kernels
* [WIP]: new nontensor basis kernels
* call the new nontensor kernels for low order problems
* multiple compilation for the same kernels but with different tuning parameters
* magma: allow different nb's for different non-tensor kernels
* tuning data for the non-tensor rtc kernels
* remove no-longer used functions, add new one for tuning the nontensor kernels
* constants for tuning
* tuning functions
* use the tuning functions in compiling/running the new kernels
* bug fix
* fixes
* fixes
* minor
* switch tuning data
* fix name
* fix name
* add function to run cuda kernels with opt-in shared memory feature
* minor fix
* minor fix
* fix calls to batch api
* allow more kernel instances
* temporary timing function
* temporary timing function
* tuning data based on hiprtc
* rollback tuning parameters
* fixes
* fixes
* fix inconsistency in the parameters passed to nvrtc/hiprtc
* minor
* a fix to the nb selector
* cleanup
* merge the opt-in feature in CeedRunKernelDimSharedOptinCuda into CeedRunKernelDimSharedCuda
* fix paths for hip-magma backends
* style
* fixes
* running make format
* undo changes from the last commit
* change HIP_DIR to ROCM_DIR and adjust the paths for magma accordingly
* replace HIP_DIR with ROCM_DIR
|
| #
b46939db
|
| 06-Dec-2022 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #1108 from CEED/jed/fix-makefile-null.d
Makefile: avoid creating null.d file (fix #1107)
|
| #
59e56409
|
| 06-Dec-2022 |
Jed Brown <jed@jedbrown.org> |
Makefile: avoid creating null.d file (fix #1107)
The feature checks in the makefile used gcc -MMD -x c /dev/null, which generates a dependency file. We need to use $(CFLAGS) because that's where target/optimization options go, so filter out the -M arguments, which relate to generating dependencies.
Reported-by: Veselin Dobrev
|
| #
2b730f8b
|
| 17-Nov-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Switch to clang-format (#1051)
* style - switch to clang-format
* ci - use newer libxsmm
* action - update format action
* format - consistent use of {} for multi-line if/for
* make - remove stray newline
* make - simpler 'make format' target
* ci - use newer libxsmm
* doc - minor release note clarification
* minor - minor fix
* minor - minor fix
* minor - minor fix
* minor - minor fix
* make format
* format - less aggressive alignment rules
* tidy - check for argument name mismatches
* fix newline
* format - mirror Ratel update to .clang-format
* fix merge error
* fix merge conflict
* fix merge error
* drop style in .phony list
* Update .clang-format
Co-authored-by: Jed Brown <jed@jedbrown.org>
* apply updated format
Co-authored-by: Jed Brown <jed@jedbrown.org>
|
| #
832499a0
|
| 20-Oct-2022 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #1076 from CEED/jed/emscripten-usability
Emscripten usability and documentation
|
| #
0bd9ac94
|
| 20-Oct-2022 |
Jed Brown <jed@jedbrown.org> |
Makefile: improve emscripten usability and auto-detection
For stand-alone use, one can do
$ emmake make build/ex2-surface.wasm
[...]
$ wasmer build/ex2-surface.wasm -- -s 999999
The executable in this mode is entirely static with a specified maximum memory.
|
| #
0be03a92
|
| 13-Oct-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Update OCCA Backend (#1072)
* Update OCCA memory interop call.
* Removes deprecated kernelBuilder calls; builds directly from the device.
* Uses `std::memset` and `std::memcpy`.
* Uses `std::to_string` instead of internal `occa::toString`.
* Uses `std::memcpy`.
* Sets kernel properties.
* Removes deprecated kernelBuilder calls; builds directly from the device.
* Removes deprecated kernelBuild call; builds directly from the device.
* Uses `std::memcpy`.
* Removes deprecated calls to `occa::linalg`.
* Add registration and device configuration for DPC++ and OpenCL backends.
* Add DPC++ and OpenCL backends to makefile.
* Spelling.
* Configure build flags for oneAPI compilers.
* Correctly set mode in `occa::json` object.
* Add missing functions to OCCA CeedVector implementation.
* Adds missing call to `setValueKernel` in OCCA CeedVector impl.
* Gets occa device function name from the CeedQFunction.
* Adds OpenCL and DPC++ to backends list.
* Uses unique kernel name for OCCA qFunction kernels.
* Adds a dummy `ceed.h` header in include in OCCA kernels.
* Rewrite arrays of structs in format that OCCA can handle.
* Adds stubs for missing functions in OCCA qfunctioncontext.
* Includes the cmath header when compiling C++ code.
* Add stubs for missing OCCA backend LinearAssembleXXX functions.
* Adds missing functions to OCCA implementation of qFunctionContext.
* Removes math function headers which were causing OCCA JIT failures.
* Rewrite arrays of structs in format that OCCA can handle.
* Rewrites fluids example qfunctions to be compatible with OCCA.
* Fixes array dimensions in mass2dbuild.
* Rewrites advection problem kernels to work with OCCA.
* Rewrites blasius problem kernels to work with OCCA.
* Rewrites channel problem kernels to work with OCCA.
* Rewrites dirichlet bc kernels to work with OCCA.
* Rewrites newtonian kernels to work with OCCA.
* Rewrites setupgeo kernels to be compatible with OCCA.
* Rewrites stabilization kernels to be compatible with OCCA.
* Rewrites stg kernels to be compatible with OCCA.
* Adds occa backends to tests for the fluids example.
* doc - update OCCA info in release notes + README
* ci - run with OCCA v1.4
* occa - update copyright boilerplate
* occa - drop unused define
* ci - fix OCCA install
* wip
* ci - fix occa skip list
* make/ci - fix use of OCCA_DIR/bin/occa
* makefile - minor style
* ci - re-enable OCCA dir caching
* doc - update release notes to mention OCCA QF workarounds
Co-authored-by: Kris Rowe <kris.rowe@anl.gov>
|
| #
3c60848e
|
| 22-Sep-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1070 from CEED/rezgar/tap-output
removed tap.sh and updated junit.py
|
| #
3d94f746
|
| 21-Sep-2022 |
rezgarshakeri <rezgar.shakeri@colorado.edu> |
removed tap.sh and updated junit.py
|
| #
417419bc
|
| 15-Sep-2022 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #1040 from CEED/jed/emscripten-build
emscripten build
|