History log of /petsc/src/mat/tests/output/ex5_55.out (Results 1 – 3 of 3)
Revision Date Author Comments
# dd874c20 10-Apr-2023 Satish Balay <balay@mcs.anl.gov>

Merge branch 'hongzh/sell-cuda' into 'main'

SELL-based SpMV

See merge request petsc/petsc!3428


# 5f9962ee 08-Apr-2023 Hong Zhang <hongzhang@anl.gov>

Add more tests


# 90d2215b 12-Jan-2021 Hong Zhang <hongzhang@anl.gov>

Add the load-balancing kernel for MatMultAdd_SeqSELL and fine-tune the heuristic

Kernel7 is significantly slower than kernel9x for the following two cases:
- nrows is too small. Kernel7 uses 2 threads per row (assuming sliceheight=16), so it does not fully utilize the GPU if nrows < 100K.
- maxslicewidth is too big.

Thanks-to: Peng Wang <penwang@nvidia.com>
