History log of /libCEED/backends/hip-gen/ceed-hip-gen-operator-build.cpp (Results 101 – 125 of 144)
Revision Date Author Comments
# 9e201c85 23-Sep-2022 Yohann <dudouit1@llnl.gov>

Refactor `cuda-gen` and `hip-gen` backends. (#1050)

* Add TODO items.

* rough, but something like this?

* wip - cleaning up some warnings, but more remain

* wip - reorganize

* wip - missing kernels

* wip - replace t1d

* fix some kernels

* another typo

* more

* another one

* closer

* define T_1D

* typosgit add .!

* WIP: changes to cuda-shared framework for new kernels

* fix output writing

* buffer fix

* buffer sizes

* WIP: fixes for 2 and 3D basis kernels

* minor

* fix weight kernel for 3d

* remove debugging output

* minor reorg

* fix includes

* enable collo grad for cuda-shared

* move quoted kernels

* renaming

* missed a rename

* small fix

* more naming consistency

* faster 'useCollograd=false' path in *-gen

* more style

* one last style fix

* clearer collograd condition

* Add gen basis kernels to hip-shared

* Try some changes to hip-shared basis block sizes for new kernels

* cuda - drop extra kernel arg

* cuda - fix collograd check logic

* update gen comment about parallelization

* tidy up fields struct definition

* tidy up structs even more

* Update hip-gen basis templates use and move other hip-gen device functions to jit-source

* Finish hip-gen basis template update; small style updates to match CUDA

* missing isStrided

* Update block size used in 3D weight for new shared kernels

* update release notes

Co-authored-by: Jeremy L Thompson <jeremy@jeremylt.org>
Co-authored-by: nbeams <246972+nbeams@users.noreply.github.com>


# dc64899e 06-Sep-2022 Yohann <dudouit1@llnl.gov>

Change the initialization logic for `useCollograd`. (#1021)

* Change the initialization logic for `useCollograd`.

* Guard useCollograd for 3D only.

* Propagate `useCollograd` change to `hip-gen`.

* Update backends/hip-gen/ceed-hip-gen-operator-build.cpp

* Propagate changes to `hip-gen`.

* Revert redimensioning of `r_tt`.

Co-authored-by: Jed Brown <jed@jedbrown.org>


# c9c2c079 05-Aug-2022 Jeremy L Thompson <jeremy@jeremylt.org>

QF headers for typedefs and macros (#1036)

* jit - qf headers for typedefs and macros

* jit - smaller list of permitted files

* ceed - only include ceed.h in QF source


# e8001fe0 07-Jul-2022 Jed Brown <jed@jedbrown.org>

Merge pull request #1009 from CEED/jrwrigh/dirichlet_with_libceed

Fluids - Use libCEED to compute Dirichlet boundary conditions


# 3b0d37b7 07-Jul-2022 Jed Brown <jed@jedbrown.org>

{cuda,hip}/gen: fix incorrect quadrature points when all bases are collocated

https://github.com/CEED/libCEED/pull/1009#issuecomment-1176751436

Co-authored-by: Natalie Beams <nbeams@icl.utk.edu>


# c9d492da 23-Jun-2022 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #1006 from CEED/jeremy/gen-all-collo-fix

Fix /gpu/*/gen backends for op with all CEED_BASIS_COLLOCATED


# 1d47fde2 22-Jun-2022 Jeremy L Thompson <jeremy@jeremylt.org>

gpu - fix /gpu/*/gen backends for op with all CEED_BASIS_COLLOCATED


# 3c4b7af6 23-May-2022 Jed Brown <jed@jedbrown.org>

Merge branch 'main' into jed/fluids-jacobian

* main:
Fluids - Add STG inflow (#868)
ci - fix Nek5000 testing


# ba6664ae 22-May-2022 James Wright <james@jameswright.xyz>

Fluids - Add STG inflow (#868)

* doc(fluids): Add STG equations

* doc(fluids): Add basic data flow for STG

* doc(fluids): Add Shur et al. 2014 STG paper to bib

* doc(fluids): Specify STG inputs files, misc additions

* doc(fluids): Add intro for STG section

* fix(fluids): Add #include ceed.h for qfunctions

- In the spirit of "include what you use"

* feat(fluids): Start work on stg_shur14.h

* doc-fix: Correct kappa_min definition

* Move STG setup functions to problems/stg_shur14

* feat: Add cholesky decomposition function

* fix: Correct stg_ctx malloc, reorganize creation

Co-authored-by: Jed Brown <jed@jedbrown.org>

* fix(fluids): Correct return values of functions

* style: Fix up style

* feat(fluids): Get file paths from PetscOptions

- Also convert SetupSTGContext to return PetscErrorCode

* fix(fluids): Correct stg_ctx dereferencing

- Also move to size_t for type of the offsets

* feat(fluids): Add funcs for processing STG*.dat files

* feat(fluids): Move to PetscOptions* for STG flags

* feat: Use `PetscMax` instead of macro

* fix(fluids): Correct/Refactor file reading functions

- Move to `PetscSynchronizedFGets`
- Remove `inline`
- Pass `comm` between functions
- Add `OpenPHASTADatFile` to DRY

* docs(fluids): Fix equation typo

* fix(fluids): Correct calculation of kappa

* feat(fluids): Complete STGShur14_Calc

* feat(fluids): Add InterpolateProfile helper func

* feat(fluids): Add CalcSpectrum helper func

* feat(fluids): Add to STGShur14_Calc qfunction

* fix: Add M_PI, Update SETERRQ functions

- Also update style

* fix: Correct interpolation outside of datarange

* fix: Add missing definition for ke in CalcSpectrum

* feat: Migrate context and func signatures, Misc

- Create SetupSTGContext to be run in another Setup_____Context
function
- Migrate STGShur14Context, CreateSTGContext, and SetupSTGContext
signatures to navierstokes.h
- Add STG contexts to Physics and CeedData
- Add missing CHKERRQ to PetscFClose
- Move to SPDX license headers

* examples/fluids: Pass solution time via context label

* feat: Implement STG boundary integral

- Add theta0 and implicit members to STGShur14Context
- Tested via implementation to the blasius BL problem (though this will
probably go against the code history)

* feat: Fix STG Stuff

* feat: Implement STG inflow for blasius BL

- Note that fluctuations are turned off in this case

* examples/fluids: Add stg_mean_only flag

* examples/fluids: Check cholesky decomp for nans

- Also correct location of cholesky decomposition in ReadSTGInflow()

* examples/fluids: Correct STG documentation

- Missing a 2 sqrt(3/2) factor and didn't take square root of q

* examples/fluids: Fix STGShur14_Calc

- Given the calculated spectrum, calculation of v' and u' verified
against python implementation (which was validated previously against
PHASTA)

* examples/fluids: Calc dXdx for boundary QFunctions

- Also calculate h from the dXdx in STGShur14_Inflow
- Replace h[0] result with constant dx spacing

* examples/fluids: Fix STG Spectra calculation

* examples/fluids: Fix build errors

- Ran into an include cycle collision that resulted in over-defining
SetupContext in advection.c
- newtonian_types.h (which has SetupContext defined) ->
stg_shur14_type.h -> navierstokes.h -> advection.c

* examples/fluids: Update and fix documentation

* examples/fluids: Correct dXdx comment, leave TODO

Co-authored-by: Jed Brown <jed@jedbrown.org>

* examples/fluids: Minor bib citation edits

Co-authored-by: Jed Brown <jed@jedbrown.org>

* examples/fluids: Add STGInflow.dat, fix blasius.yaml

* examples/fluids: int -> PetscInt | CeedInt

* examples/fluids: Style

* examples/fluids: Make Boolean names verb_noun format

- Also changes the stg flag to `-stg_use`

* examples/fluids: Add STG test

* examples/fluids: Style fix up

* examples/fluids: Update docs

* examples/fluids: Implement weakT option for STG

* examples/fluids: Fix casting for ROCm

* examples/fluids: avoid PETSc dependency in qfunctions

* examples/fluids: header cleanup

* backends/hip: avoid redundant inline

* examples/fluids: avoid VLA in qfunctions

GPUs don't like VLA and some compilers reject it when targeting GPUs.

* examples/fluids: Create STG_NMODES_MAX

* examples/fluids: Refactor stg setup out of blasius.c

* examples/fluids: Fix misc GPU bugs

Co-authored-by: Jed Brown <jed@jedbrown.org>


# 4345bdd5 20-Mar-2022 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #920 from CEED/jeremy/conversion

Explicit casting of vector sizes


# 539ec17d 20-Mar-2022 Jeremy L Thompson <jeremy@jeremylt.org>

gpu - fix error handling in size conversion


# ce18bed9 17-Mar-2022 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #858 from CEED/jeremy/dump-copy-stuff

Strip redundant/outdated license info duplication


# 3d8e8822 17-Mar-2022 Jeremy L Thompson <jeremy@jeremylt.org>

minor - update copyright headers


# 60224bc5 14-Mar-2022 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #913 from CEED/jeremy/coo-ptrdiff

Create CeedSize as ptrdiff_t


# e79b91d9 11-Mar-2022 Jeremy L Thompson <jeremy@jeremylt.org>

rstr - use CeedSize for l_size


# f99981a3 25-Feb-2022 Jed Brown <jed@jedbrown.org>

Merge pull request #893 from CEED/natalie/more-hip-launch-bounds

HIP: add atomics flag and more kernel launch bounds for performance improvements


# 37c3b1cf 24-Feb-2022 nbeams <246972+nbeams@users.noreply.github.com>

Change to more specific name for hip-gen block size function


# b3e1519b 31-Jan-2022 nbeams <246972+nbeams@users.noreply.github.com>

Add launch bounds to hip-gen operator kernel


# 51d630a3 24-Dec-2021 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #864 from CEED/jeremy/gpu-templates

GPU - pull quoted kernels into separate files


# 46dc0734 23-Dec-2021 Jeremy L Thompson <jeremy@jeremylt.org>

gpu - improved human-readability of debugging output


# 437930d1 22-Dec-2021 Jeremy L Thompson <jeremy@jeremylt.org>

gpu - pull quoted kernels into separate files


# d92fedf5 22-Dec-2021 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #863 from CEED/jeremy/gpu-jit-code

GPU - separate common code into separate folder


# 0d0321e0 22-Dec-2021 Jeremy L Thompson <jeremy@jeremylt.org>

style - consistent naming and style for gpu backends


# 7fcac036 22-Dec-2021 Jeremy L Thompson <jeremy@jeremylt.org>

gpu - split common cuda/hip data into separate folder


# d0dee30e 19-Nov-2021 Jeremy L Thompson <jeremy@jeremylt.org>

Merge pull request #840 from CEED/jeremy/env-debug

Macro for Debug without Ceed Context

