| #
9e201c85
|
| 23-Sep-2022 |
Yohann <dudouit1@llnl.gov> |
Refactor `cuda-gen` and `hip-gen` backends. (#1050)
* Add TODO items.
* rough, but something like this?
* wip - cleaning up some warnings, but more remain
* wip - reorganize
* wip - missing kernels
* wip - replace t1d
* fix some kernels
* another typo
* more
* another one
* closer
* define T_1D
* typosgit add .!
* WIP: changes to cuda-shared framework for new kernels
* fix output writing
* buffer fix
* buffer sizes
* WIP: fixes for 2 and 3D basis kernels
* minor
* fix weight kernel for 3d
* remove debugging output
* minor reorg
* fix includes
* enable collo grad for cuda-shared
* move quoted kernels
* renaming
* missed a rename
* small fix
* more naming consistency
* faster 'useCollograd=false' path in *-gen
* more style
* one last style fix
* clearer collograd condition
* Add gen basis kernels to hip-shared
* Try some changes to hip-shared basis block sizes for new kernels
* cuda - drop extra kernel arg
* cuda - fix collograd check logic
* update gen comment about parallelization
* tidy up fields struct definition
* tidy up structs even more
* Update hip-gen basis templates use and move other hip-gen device functions to jit-source
* Finish hip-gen basis template update; small style updates to match CUDA
* missing isStrided
* Update block size used in 3D weight for new shared kernels
* update release notes
Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
Co-authored-by: nbeams <246972+nbeams@users.noreply.github.com>
|
| #
dc64899e
|
| 06-Sep-2022 |
Yohann <dudouit1@llnl.gov> |
Change the initialization logic for `useCollograd`. (#1021)
* Change the initialization logic for `useCollograd`.
* Guard useCollograd for 3D only.
* Propagate `useCollograd` change to `hip-gen`.
* Update backends/hip-gen/ceed-hip-gen-operator-build.cpp
* Propagate changes to `hip-gen`.
* Revert redimensioning of `r_tt`.
Co-authored-by: Jed Brown <jed@jedbrown.org>
|
| #
c9c2c079
|
| 05-Aug-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
QF headers for typedefs and macros (#1036)
* jit - qf headers for typedefs and macros
* jit - smaller list of permitted files
* ceed - only include ceed.h in QF source
|
| #
e8001fe0
|
| 07-Jul-2022 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #1009 from CEED/jrwrigh/dirichlet_with_libceed
Fluids - Use libCEED to compute Dirichlet boundary conditions
|
| #
3b0d37b7
|
| 07-Jul-2022 |
Jed Brown <jed@jedbrown.org> |
{cuda,hip}/gen: fix incorrect quadrature points when all bases are collocated
https://github.com/CEED/libCEED/pull/1009#issuecomment-1176751436
Co-authored-by: Natalie Beams <nbeams@icl.utk.edu>
|
| #
c9d492da
|
| 23-Jun-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #1006 from CEED/jeremy/gen-all-collo-fix
Fix /gpu/*/gen backends for op with all CEED_BASIS_COLLOCATED
|
| #
1d47fde2
|
| 22-Jun-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - fix /gpu/*/gen backends for op with all CEED_BASIS_COLLOCATED
|
| #
3c4b7af6
|
| 23-May-2022 |
Jed Brown <jed@jedbrown.org> |
Merge branch 'main' into jed/fluids-jacobian
* main:
Fluids - Add STG inflow (#868)
ci - fix Nek5000 testing
|
| #
ba6664ae
|
| 22-May-2022 |
James Wright <james@jameswright.xyz> |
Fluids - Add STG inflow (#868)
* doc(fluids): Add STG equations
* doc(fluids): Add basic data flow for STG
* doc(fluids): Add Shur et al. 2014 STG paper to bib
* doc(fluids): Specify STG inputs files, misc additions
* doc(fluids): Add intro for STG section
* fix(fluids): Add #include ceed.h for qfunctions
- In the spirit of "include what you use"
* feat(fluids): Start work on stg_shur14.h
* doc-fix: Correct kappa_min definition
* Move STG setup functions to problems/stg_shur14
* feat: Add cholesky decomposition function
* fix: Correct stg_ctx malloc, reorganize creation
Co-authored-by: Jed Brown <jed@jedbrown.org>
* fix(fluids): Correct return values of functions
* style: Fix up style
* feat(fluids): Get file paths from PetscOptions
- Also convert SetupSTGContext to return PetscErrorCode
* fix(fluids): Correct stg_ctx dereferencing
- Also move to size_t for type of the offsets
* feat(fluids): Add funcs for processing STG*.dat files
* feat(fluids): Move to PetscOptions* for STG flags
* feat: Use `PetscMax` instead of macro
* fix(fluids): Correct/Refactor file reading functions
- Move to `PetscSynchronizedFGets`
- Remove `inline`
- Pass `comm` between functions
- Add `OpenPHASTADatFile` to DRY
* docs(fluids): Fix equation typo
* fix(fluids): Correct calculation of kappa
* feat(fluids): Complete STGShur14_Calc
* feat(fluids): Add InterpolateProfile helper func
* feat(fluids): Add CalcSpectrum helper func
* feat(fluids): Add to STGShur14_Calc qfunction
* fix: Add M_PI, Update SETERRQ functions
- Also update style
* fix: Correct interpolation outside of datarange
* fix: Add missing definition for ke in CalcSpectrum
* feat: Migrate context and func signatures, Misc
- Create SetupSTGContext to be run in another Setup_____Context
function
- Migrate STGShur14Context, CreateSTGContext, and SetupSTGContext
signatures to navierstokes.h
- Add STG contexts to Physics and CeedData
- Add missing CHKERRQ to PetscFClose
- Move to SPDX license headers
* examples/fluids: Pass solution time via context label
* feat: Implement STG boundary integral
- Add theta0 and implicit members to STGShur14Context
- Tested via implementation to the blasius BL problem (though this will
probably go against the code history)
* feat: Fix STG Stuff
* feat: Implement STG inflow for blasius BL
- Note that fluctuations are turned off in this case
* examples/fluids: Add stg_mean_only flag
* examples/fluids: Check cholesky decomp for nans
- Also correct location of cholesky decomposition in ReadSTGInflow()
* examples/fluids: Correct STG documentation
- Missing a 2 sqrt(3/2) factor and didn't take square root of q
* examples/fluids: Fix STGShur14_Calc
- Given the calculated spectrum, calculation of v' and u' verified
against python implementation (which was validated previously against
PHASTA)
* examples/fluids: Calc dXdx for boundary QFunctions
- Also calculate h from the dXdx in STGShur14_Inflow
- Replace h[0] result with constant dx spacing
* examples/fluids: Fix STG Spectra calculation
* examples/fluids: Fix build errors
- Ran into an include cycle collision that resulted in over-defining
SetupContext in advection.c
- newtonian_types.h (which has SetupContext defined) ->
stg_shur14_type.h -> navierstokes.h -> advection.c
* examples/fluids: Update and fix documentation
* examples/fluids: Correct dXdx comment, leave TODO
Co-authored-by: Jed Brown <jed@jedbrown.org>
* examples/fluids: Minor bib citation edits
Co-authored-by: Jed Brown <jed@jedbrown.org>
* examples/fluids: Add STGInflow.dat, fix blasius.yaml
* examples/fluids: int -> PetscInt | CeedInt
* examples/fluids: Style
* examples/fluids: Make Boolean names verb_noun format
- Also changes the stg flag to `-stg_use`
* examples/fluids: Add STG test
* examples/fluids: Style fix up
* examples/fluids: Update docs
* examples/fluids: Implement weakT option for STG
* examples/fluids: Fix casting for ROCm
* examples/fluids: avoid PETSc dependency in qfunctions
* examples/fluids: header cleanup
* backends/hip: avoid redundant inline
* examples/fluids: avoid VLA in qfunctions
GPUs don't like VLA and some compilers reject it when targeting GPUs.
* examples/fluids: Create STG_NMODES_MAX
* examples/fluids: Refactor stg setup out of blasius.c
* examples/fluids: Fix misc GPU bugs
Co-authored-by: Jed Brown <jed@jedbrown.org>
|
| #
4345bdd5
|
| 20-Mar-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #920 from CEED/jeremy/conversion
Explicit casting of vector sizes
|
| #
539ec17d
|
| 20-Mar-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - fix error handling in size conversion
|
| #
ce18bed9
|
| 17-Mar-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #858 from CEED/jeremy/dump-copy-stuff
Strip redundant/outdated license info duplication
|
| #
3d8e8822
|
| 17-Mar-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
minor - update copyright headers
|
| #
60224bc5
|
| 14-Mar-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #913 from CEED/jeremy/coo-ptrdiff
Create CeedSize as ptrdiff_t
|
| #
e79b91d9
|
| 11-Mar-2022 |
Jeremy L Thompson <jeremy@jeremylt.org> |
rstr - use CeedSize for l_size
|
| #
f99981a3
|
| 25-Feb-2022 |
Jed Brown <jed@jedbrown.org> |
Merge pull request #893 from CEED/natalie/more-hip-launch-bounds
HIP: add atomics flag and more kernel launch bounds for performance improvements
|
| #
37c3b1cf
|
| 24-Feb-2022 |
nbeams <246972+nbeams@users.noreply.github.com> |
Change to more specific name for hip-gen block size function
|
| #
b3e1519b
|
| 31-Jan-2022 |
nbeams <246972+nbeams@users.noreply.github.com> |
Add launch bounds to hip-gen operator kernel
|
| #
51d630a3
|
| 24-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #864 from CEED/jeremy/gpu-templates
GPU - pull quoted kernels into separate files
|
| #
46dc0734
|
| 23-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - improved human-readability of debugging output
|
| #
437930d1
|
| 22-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - pull quoted kernels into separate files
|
| #
d92fedf5
|
| 22-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #863 from CEED/jeremy/gpu-jit-code
GPU - separate common code into separate folder
|
| #
0d0321e0
|
| 22-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
style - consistent naming and style for gpu backends
|
| #
7fcac036
|
| 22-Dec-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
gpu - split common cuda/hip data into separate folder
|
| #
d0dee30e
|
| 19-Nov-2021 |
Jeremy L Thompson <jeremy@jeremylt.org> |
Merge pull request #840 from CEED/jeremy/env-debug
Macro for Debug without Ceed Context
|