History log of /petsc/src/mat/tests/ex5.c (Results 1 – 25 of 39)
Revision Date Author Comments
# 732aec7a 22-Sep-2024 Satish Balay <balay@mcs.anl.gov>

Merge branch 'jolivet/remove-cast' into 'main'

Use NULL or nullptr instead of casted 0

See merge request petsc/petsc!7857


# c8025a54 21-Sep-2024 Pierre Jolivet <pierre@joliv.et>

Use NULL or nullptr instead of casted 0


# 3b91a372 26-Mar-2024 Satish Balay <balay@mcs.anl.gov>

Merge branch 'hongzh/sell-hip' into 'main'

Add SELLHIP

See merge request petsc/petsc!7338


# 773bf0f6 05-Mar-2024 Hong Zhang <hongzhang@anl.gov>

Add SELLHIP

- The HIP kernels are converted directly from their CUDA version
- AMD GPUs and NVIDIA GPUs use different warp sizes. We set the warp size to 64 by default for AMD GPUs to facilitate compile-time code optimization


# e8e8640d 26-Sep-2023 Satish Balay <balay@mcs.anl.gov>

Merge branch 'jolivet/rm-first-empty-line' into 'main'

Remove first and last empty lines

See merge request petsc/petsc!6892


# 92bec4ee 26-Sep-2023 Pierre Jolivet <pierre@joliv.et>

Remove first and last empty lines


# dd874c20 10-Apr-2023 Satish Balay <balay@mcs.anl.gov>

Merge branch 'hongzh/sell-cuda' into 'main'

SELL-based SpMV

See merge request petsc/petsc!3428


# 5f9962ee 08-Apr-2023 Hong Zhang <hongzhang@anl.gov>

Add more tests


# 8711c661 01-Apr-2023 Hong Zhang <hongzhang@anl.gov>

Guard against complex build for unsupported kernels


# 90d2215b 12-Jan-2021 Hong Zhang <hongzhang@anl.gov>

Add the load-balancing kernel for MatMultAdd_SeqSELL and fine-tune the heuristic

Kernel7 is significantly slower than kernel9x for the following two cases:
- nrows is too small. Kernel7 uses 2 threads per row (assuming sliceheight=16), so it does not fully utilize the GPU if nrows < 100K.
- maxslicewidth is too big.

Thanks-to: Peng Wang <penwang@nvidia.com>


# 2d1451d4 09-Jan-2020 Hong Zhang <hongzhang@anl.gov>

Initial commit for porting SELL to GPU

- Add tiled SpMV and basic SpMV for SeqSELL
- Tested in serial
- Offloadmask is used to determine when the matrix should be copied to GPU
- Use different slice height for CUDA version
- By checking the nonzerostate, PETSc can decide whether the whole matrix needs to be copied or just the values
- Make the convert function public so that the very slow MatConvert_Basic can sometimes be avoided. E.g., one can use a two-step convert method (AIJ->SELL, then SELL->SELLCUDA) instead of the direct convert AIJ->SELLCUDA
- Make the FLOPS count for SELL same as that for AIJCUSPARSE.
- MatDisAssemble is not needed.
- Change slice height from 32 to 16 for GPU
- To overlap communication with MatMult, VecScatterBegin() should be called before MatMult() for the diagonal part.
- SLICE_HEIGHT is defined to be 32 to match the warp size of GPU. For other cases, it is still 8.
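
For readers unfamiliar with the sliced ELLPACK (SELL) layout that these notes refer to, the following is a minimal CPU-side sketch of a SELL matrix-vector product. It is not the PETSc CUDA kernel: the struct `SellMatrix`, its field names, and the function `sell_spmv` are hypothetical and chosen only for illustration. It assumes each slice is padded to the length of its longest row, padding entries store 0.0 with a valid column index, and entries within a slice are stored column by column, which is why a slice height that matches (or divides) the warp size lets neighbouring GPU threads read neighbouring memory.

```c
#include <assert.h>

/* Hypothetical container for a matrix in sliced ELLPACK (SELL) layout.
   Field names are illustrative, not PETSc's internal data structure. */
typedef struct {
  int           nrows;        /* number of rows */
  int           slice_height; /* rows per slice, e.g. 8, 16, or 32 */
  const int    *slice_ptr;    /* start of each slice in val/colidx; length nslices + 1 */
  const int    *colidx;       /* column index per stored entry (padding uses a valid column) */
  const double *val;          /* value per stored entry (padding stores 0.0) */
} SellMatrix;

/* Computes y = A * x. Inside a slice, entries are stored column by column,
   so entry j of local row r sits at slice_ptr[s] + j * slice_height + r. */
static void sell_spmv(const SellMatrix *A, const double *x, double *y)
{
  assert(A && x && y && A->slice_height > 0);
  const int h       = A->slice_height;
  const int nslices = (A->nrows + h - 1) / h;

  for (int s = 0; s < nslices; s++) {
    const int begin = A->slice_ptr[s];
    const int width = (A->slice_ptr[s + 1] - begin) / h; /* padded row length */
    for (int r = 0; r < h; r++) {
      const int row = s * h + r;
      if (row >= A->nrows) break; /* the last slice may be only partly filled */
      double sum = 0.0;
      for (int j = 0; j < width; j++) {
        const int k = begin + j * h + r;
        sum += A->val[k] * x[A->colidx[k]];
      }
      y[row] = sum;
    }
  }
}
```

In a GPU version, the loop over `r` would typically map to threads, so one slice of height 32 corresponds to one NVIDIA warp; with a height of 16, two threads can cooperate on each row.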

Funded-by:
Project: PETSc for GPU
Time: 42 hours
Reported-by:
Thanks-to:


# b047e4b5 17-Feb-2023 Satish Balay <balay@mcs.anl.gov>

Merge branch 'barry/2022-09-12/fix-mat-preallocation' into 'main'

try to build efficient hash table allocation directly into Mat

See merge request petsc/petsc!5621


# 23a3927d 14-Dec-2022 Barry Smith <bsmith@mcs.anl.gov>

Do not name an unnamed object unless necessary while viewing matrices

Commit-type: bug, usability
/spend 5m


# 061e922f 22-Sep-2022 Satish Balay <balay@mcs.anl.gov>

Merge branch 'jacobf/2022-09-21/2-bike-2-shed' into 'main'

Feature: Bicycle Storage Facility 2

See merge request petsc/petsc!5661


# d71ae5a4 21-Sep-2022 Jacob Faibussowitsch <jacob.fai@gmail.com>

source code format changes due to .clang-format changes


# 38f67375 27-Aug-2022 Satish Balay <balay@mcs.anl.gov>

Merge branch 'jolivet/fix-style-one-liners' into 'main'

Remove braces from one-liners

See merge request petsc/petsc!5557


# 48a46eb9 27-Aug-2022 Pierre Jolivet <pierre@joliv.et>

Remove braces from one-liners


# 47dfd8b8 26-Aug-2022 Satish Balay <balay@mcs.anl.gov>

Merge branch 'jczhang/for-jlse-test' into 'main'

Turn on remaining petsc/kokkos tests for sycl

See merge request petsc/petsc!5556


# dcfd994d 25-Jul-2022 Junchao Zhang <jczhang@mcs.anl.gov>

Tests: turn on some kokkos tests for sycl


# 58d68138 23-Aug-2022 Satish Balay <balay@mcs.anl.gov>

Merge branch 'barry/2022-08-21/clang-format-source' into 'main'

format repository with clang-format

See merge request petsc/petsc!5541


# 9371c9d4 22-Aug-2022 Satish Balay <balay@mcs.anl.gov>

clang-format: convert PETSc sources to comply with clang-format


# 5cab5458 26-Jul-2022 Satish Balay <balay@mcs.anl.gov>

Merge branch 'barry/2022-07-23/add-petscbeginuser' into 'main'

Add PetscFunctionBeginUser to all PETSc C/C++ examples

See merge request petsc/petsc!5470


# 327415f7 23-Jul-2022 Barry Smith <bsmith@mcs.anl.gov>

Add PetscFunctionBeginUser to all PETSc C/C++ examples

Now the stack frames will contain the main program and the correct line numbers in them

git ls-files | egrep "(tutorials|tests)" | xargs sed -i "s?\(PetscCall(PetscInitialize(&argc\)?PetscFunctionBeginUser;\n \1?g"

Commit-type: error-checking, testing-fix
/spend 15m


# f882803c 26-Mar-2022 Satish Balay <balay@mcs.anl.gov>

Merge branch 'jacobf/2022-02-23/variadic-chkerr' into 'main'

Variadic CHKERRQ()

See merge request petsc/petsc!4889


# 9566063d 25-Mar-2022 Jacob Faibussowitsch <jacob.fai@gmail.com>

The great renaming:

- CHKERRQ() -> PetscCall()
- CHKERRV() -> PetscCallVoid()
- CHKERRMPI() -> PetscCallMPI()
- CHKERRABORT() -> PetscCallAbort()
- CHKERRCONTINUE() -> PetscCallContinue()
- CHKERRXX() -> PetscCallThrow()
- CHKERRCXX() -> PetscCallCXX()
- CHKERRCUDA() -> PetscCallCUDA()
- CHKERRCUBLAS() -> PetscCallCUBLAS()
- CHKERRCUSPARSE() -> PetscCallCUSPARSE()
- CHKERRCUSOLVER() -> PetscCallCUSOLVER()
- CHKERRCUFFT() -> PetscCallCUFFT()
- CHKERRCURAND() -> PetscCallCURAND()
- CHKERRHIP() -> PetscCallHIP()
- CHKERRHIPBLAS() -> PetscCallHIPBLAS()
- CHKERRHIPSOLVER() -> PetscCallHIPSOLVER()
- CHKERRQ_CEED() -> PetscCallCEED()
- CHKERR_FORTRAN_VOID_FUNCTION() -> PetscCallFortranVoidFunction()
- CHKERRMKL() -> PetscCallMKL()
- CHKERRMMG() -> PetscCallMMG()
- CHKERRMMG_NONSTANDARD() -> PetscCallMMG_NONSTANDARD()
- CHKERRCGNS() -> PetscCallCGNS()
- CHKERRPTSCOTCH() -> PetscCallPTSCOTCH()
- CHKERRSTR() -> PetscCallSTR()
- CHKERRTC() -> PetscCallTC()
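
To illustrate what a rename of this kind changes at a call site, here is a small self-contained sketch that does not use PETSc. The macro `MY_CALL`, the type `ErrorCode`, and the functions `set_positive` and `configure` are all hypothetical stand-ins; the actual PetscCall() implementation does more (for example, error-stack handling) and is not reproduced here. The sketch only shows the general pattern of wrapping the call itself rather than storing an error code and checking it in a second statement.

```c
#include <stdio.h>

typedef int ErrorCode; /* hypothetical stand-in for an error-code type; 0 means success */

/* Hypothetical macro illustrating the "wrap the call" pattern:
   evaluate the expression once and return early on a nonzero code. */
#define MY_CALL(...) \
  do { \
    ErrorCode ierr_ = (__VA_ARGS__); \
    if (ierr_ != 0) { \
      fprintf(stderr, "%s:%d: error %d in %s\n", __FILE__, __LINE__, ierr_, #__VA_ARGS__); \
      return ierr_; \
    } \
  } while (0)

/* Illustrative callee: fails with code 1 for non-positive input. */
static ErrorCode set_positive(int value, int *out)
{
  if (value <= 0) return 1;
  *out = value;
  return 0;
}

/* Older style would read:  ierr = set_positive(a, &x); CHECK(ierr);
   The wrapped style needs one statement per call. */
static ErrorCode configure(int a, int b, int *sum)
{
  int x, y;
  MY_CALL(set_positive(a, &x));
  MY_CALL(set_positive(b, &y));
  *sum = x + y;
  return 0;
}
```

Because the macro returns from the enclosing function on failure, output parameters such as `*sum` are left untouched when any wrapped call fails.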
