| #
732aec7a
|
| 22-Sep-2024 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jolivet/remove-cast' into 'main'
Use NULL or nullptr instead of casted 0
See merge request petsc/petsc!7857
|
| #
c8025a54
|
| 21-Sep-2024 |
Pierre Jolivet <pierre@joliv.et> |
Use NULL or nullptr instead of casted 0
|
| #
3b91a372
|
| 26-Mar-2024 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'hongzh/sell-hip' into 'main'
Add SELLHIP
See merge request petsc/petsc!7338
|
| #
773bf0f6
|
| 05-Mar-2024 |
Hong Zhang <hongzhang@anl.gov> |
Add SELLHIP
- The HIP kernels are converted directly from their CUDA version
- AMD GPUs and NVIDIA GPUs use different warp sizes. We set the warp size to 64 by default for AMD GPUs to facilitate compile-time code optimization
|
| #
e8e8640d
|
| 26-Sep-2023 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jolivet/rm-first-empty-line' into 'main'
Remove first and last empty lines
See merge request petsc/petsc!6892
|
| #
92bec4ee
|
| 26-Sep-2023 |
Pierre Jolivet <pierre@joliv.et> |
Remove first and last empty lines
|
| #
dd874c20
|
| 10-Apr-2023 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'hongzh/sell-cuda' into 'main'
SELL-based SpMV
See merge request petsc/petsc!3428
|
| #
5f9962ee
|
| 08-Apr-2023 |
Hong Zhang <hongzhang@anl.gov> |
Add more tests
|
| #
8711c661
|
| 01-Apr-2023 |
Hong Zhang <hongzhang@anl.gov> |
Guard against complex build for unsupported kernels
|
| #
90d2215b
|
| 12-Jan-2021 |
Hong Zhang <hongzhang@anl.gov> |
Add the load-balancing kernel for MatMultAdd_SeqSELL and fine tune the heuristic
Kernel7 is significantly slower than kernel9x for the following two cases:
- nrows is too small. Kernel7 uses 2 threads per row (assuming sliceheight=16), so it does not fully utilize the GPU if nrows < 100K.
- maxslicewidth is too big.
Thanks-to: Peng Wang <penwang@nvidia.com>
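The two cases named in this commit message can be read as a kernel-selection rule. The following is a minimal sketch of such a rule, not the heuristic PETSc actually ships: the enum, the function name `choose_kernel`, and the width threshold `MAX_WIDTH_FOR_KERNEL7` are hypothetical, and only the 100K row threshold comes from the message.

```c
#include <assert.h>

/* Hypothetical labels for the two kernels mentioned in the commit message. */
typedef enum { KERNEL7, KERNEL9X } SpmvKernel;

/* From the commit message: kernel7 underuses the GPU below roughly 100K rows. */
#define MIN_ROWS_FOR_KERNEL7 100000L
/* Assumption for illustration only: the message gives no number for "too big". */
#define MAX_WIDTH_FOR_KERNEL7 64

/* Pick kernel9x in the two cases where the message says kernel7 is slower. */
static SpmvKernel choose_kernel(long nrows, int max_slice_width)
{
  assert(nrows >= 0 && max_slice_width >= 0);
  if (nrows < MIN_ROWS_FOR_KERNEL7) return KERNEL9X;           /* too few rows */
  if (max_slice_width > MAX_WIDTH_FOR_KERNEL7) return KERNEL9X; /* slices too wide */
  return KERNEL7;
}
```

With these assumed thresholds, `choose_kernel(50000, 8)` selects `KERNEL9X`, while `choose_kernel(200000, 8)` selects `KERNEL7`.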
show more ...
|
| #
2d1451d4
|
| 09-Jan-2020 |
Hong Zhang <hongzhang@anl.gov> |
Initial commit for porting SELL to GPU
- Add tiled SpMV and basic SpMV for SeqSELL
- Tested in serial
- Offloadmask is used to determine when the matrix should be copied to GPU
- Use different slice height for CUDA version
- By checking the nonzerostate, PETSc can decide whether the whole matrix needs to be copied or just the values
- Make the convert function public so that the very slow MatConvert_Basic can be avoided sometimes. E.g. one can use a two-step convert method: AIJ->SELL, SELL->SELLCUDA instead of the direct convert AIJ->SELLCUDA
- Make the FLOPS count for SELL the same as that for AIJCUSPARSE.
- MatDisAssemble is not needed.
- Change slice height from 32 to 16 for GPU
- To overlap communication with MatMult, VecScatterBegin() should be called before MatMult() for the diagonal part.
- SLICE_HEIGHT is defined to be 32 to match the warp size of GPU. For other cases, it is still 8.
Funded-by:
Project: PETSc for GPU
Time: 42 hours
Reported-by:
Thanks-to:
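For readers unfamiliar with the sliced ELLPACK (SELL) layout this commit ports to the GPU, here is a minimal serial sketch of a basic SpMV over that layout. It assumes the common SELL convention: rows are grouped into slices of `slice_height` rows, each slice is padded to its widest row, and entries are stored column by column within a slice. The struct and its field names are hypothetical and are not PETSc's `Mat_SeqSELL`.

```c
#include <assert.h>

/* Hypothetical SELL container; names are illustrative, not PETSc's. */
typedef struct {
  int           nrows;        /* number of matrix rows */
  int           slice_height; /* rows per slice */
  const int    *slice_ptr;    /* start offset of each slice; length nslices + 1 */
  const int    *col;          /* column indices; padding uses any valid column */
  const double *val;          /* values; padding entries are 0.0 */
} SellMatrix;

/* Compute y = A * x, one slice at a time. */
static void sell_spmv(const SellMatrix *A, const double *x, double *y)
{
  assert(A->slice_height > 0);
  const int H       = A->slice_height;
  const int nslices = (A->nrows + H - 1) / H;

  for (int s = 0; s < nslices; s++) {
    const int base  = A->slice_ptr[s];
    const int width = (A->slice_ptr[s + 1] - base) / H; /* padded row length */
    for (int r = 0; r < H; r++) {
      const int row = s * H + r;
      if (row >= A->nrows) break; /* padding rows in the last slice */
      double sum = 0.0;
      for (int j = 0; j < width; j++) {
        const int k = base + j * H + r; /* column-major within the slice */
        sum += A->val[k] * x[A->col[k]];
      }
      y[row] = sum;
    }
  }
}
```

Because entries of neighbouring rows sit next to each other in memory, a GPU version can assign adjacent threads to adjacent rows of a slice and get coalesced loads, which is one reason to tie the slice height to the warp size as the message describes.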
|
| #
b047e4b5
|
| 17-Feb-2023 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'barry/2022-09-12/fix-mat-preallocation' into 'main'
try to build efficient hash table allocation directly into Mat
See merge request petsc/petsc!5621
|
| #
23a3927d
|
| 14-Dec-2022 |
Barry Smith <bsmith@mcs.anl.gov> |
Do not name an unnamed object unless necessary while viewing matrices
Commit-type: bug, usability /spend 5m
|
| #
061e922f
|
| 22-Sep-2022 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jacobf/2022-09-21/2-bike-2-shed' into 'main'
Feature: Bicycle Storage Facility 2
See merge request petsc/petsc!5661
|
| #
d71ae5a4
|
| 21-Sep-2022 |
Jacob Faibussowitsch <jacob.fai@gmail.com> |
source code format changes due to .clang-format changes
|
| #
38f67375
|
| 27-Aug-2022 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jolivet/fix-style-one-liners' into 'main'
Remove braces from one-liners
See merge request petsc/petsc!5557
|
| #
48a46eb9
|
| 27-Aug-2022 |
Pierre Jolivet <pierre@joliv.et> |
Remove braces from one-liners
|
| #
47dfd8b8
|
| 26-Aug-2022 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jczhang/for-jlse-test' into 'main'
Turn on remaining petsc/kokkos tests for sycl
See merge request petsc/petsc!5556
|
| #
dcfd994d
|
| 25-Jul-2022 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Tests: turn on some kokkos tests for sycl
|
| #
58d68138
|
| 23-Aug-2022 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'barry/2022-08-21/clang-format-source' into 'main'
format repository with clang-format
See merge request petsc/petsc!5541
|
| #
9371c9d4
|
| 22-Aug-2022 |
Satish Balay <balay@mcs.anl.gov> |
clang-format: convert PETSc sources to comply with clang-format
|
| #
5cab5458
|
| 26-Jul-2022 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'barry/2022-07-23/add-petscbeginuser' into 'main'
Add PetscFunctionBeginUser to all PETSc C/C++ examples
See merge request petsc/petsc!5470
|
| #
327415f7
|
| 23-Jul-2022 |
Barry Smith <bsmith@mcs.anl.gov> |
Add PetscFunctionBeginUser to all PETSc C/C++ examples
Now the stack frames will contain the main program and the correct line numbers in them
git ls-files | egrep "(tutorials|tests)" | xargs sed -i "s?\(PetscCall(PetscInitialize(&argc\)?PetscFunctionBeginUser;\n \1?g"
Commit-type: error-checking, testing-fix /spend 15m
|
| #
f882803c
|
| 26-Mar-2022 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jacobf/2022-02-23/variadic-chkerr' into 'main'
Variadic CHKERRQ()
See merge request petsc/petsc!4889
|
| #
9566063d
|
| 25-Mar-2022 |
Jacob Faibussowitsch <jacob.fai@gmail.com> |
The great renaming:
- CHKERRQ() -> PetscCall()
- CHKERRV() -> PetscCallVoid()
- CHKERRMPI() -> PetscCallMPI()
- CHKERRABORT() -> PetscCallAbort()
- CHKERRCONTINUE() -> PetscCallContinue()
- CHKERRXX() -> PetscCallThrow()
- CHKERRCXX() -> PetscCallCXX()
- CHKERRCUDA() -> PetscCallCUDA()
- CHKERRCUBLAS() -> PetscCallCUBLAS()
- CHKERRCUSPARSE() -> PetscCallCUSPARSE()
- CHKERRCUSOLVER() -> PetscCallCUSOLVER()
- CHKERRCUFFT() -> PetscCallCUFFT()
- CHKERRCURAND() -> PetscCallCURAND()
- CHKERRHIP() -> PetscCallHIP()
- CHKERRHIPBLAS() -> PetscCallHIPBLAS()
- CHKERRHIPSOLVER() -> PetscCallHIPSOLVER()
- CHKERRQ_CEED() -> PetscCallCEED()
- CHKERR_FORTRAN_VOID_FUNCTION() -> PetscCallFortranVoidFunction()
- CHKERRMKL() -> PetscCallMKL()
- CHKERRMMG() -> PetscCallMMG()
- CHKERRMMG_NONSTANDARD() -> PetscCallMMG_NONSTANDARD()
- CHKERRCGNS() -> PetscCallCGNS()
- CHKERRPTSCOTCH() -> PetscCallPTSCOTCH()
- CHKERRSTR() -> PetscCallSTR()
- CHKERRTC() -> PetscCallTC()
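The renaming goes together with a change in calling style: the older pattern assigned the return code and then checked it (`ierr = f(...); CHKERRQ(ierr);`), while the new macros wrap the call itself (`PetscCall(f(...));`). The following self-contained sketch shows how a variadic error-checking macro of this kind can be written. `CALL`, `ErrorCode`, `might_fail`, and `run` are stand-ins invented for illustration, and the macro body is not PETSc's actual definition of `PetscCall()`.

```c
#include <assert.h>
#include <stddef.h>

typedef int ErrorCode; /* stand-in for an error-code type; 0 means success */

/* Variadic wrapper: evaluate the wrapped expression once and return early
   from the enclosing function if it reports a nonzero error code. */
#define CALL(...) \
  do { \
    ErrorCode ierr_ = __VA_ARGS__; \
    if (ierr_ != 0) return ierr_; \
  } while (0)

/* Hypothetical callee that fails on request with error code 42. */
static ErrorCode might_fail(int fail)
{
  return fail ? 42 : 0;
}

/* Runs two checked steps and records how many completed. */
static ErrorCode run(int fail, int *steps)
{
  assert(steps != NULL);
  *steps = 0;
  CALL(might_fail(0));    /* always succeeds */
  (*steps)++;
  CALL(might_fail(fail)); /* returns 42 from run() when fail is nonzero */
  (*steps)++;
  return 0;
}
```

Taking `__VA_ARGS__` rather than a single named parameter means the wrapped expression is passed through whole, even when it contains commas that are not enclosed in parentheses.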
|