ex5_55.out - OpenGrok history log for /petsc/src/mat/tests/output/ex5

Revision	Date	Author	Comments
# dd874c20	10-Apr-2023	Satish Balay <balay@mcs.anl.gov>	Merge branch 'hongzh/sell-cuda' into 'main'. SELL-based SpMV. See merge request petsc/petsc!3428
# 5f9962ee	08-Apr-2023	Hong Zhang <hongzhang@anl.gov>	Add more tests
# 90d2215b	12-Jan-2021	Hong Zhang <hongzhang@anl.gov>	Add the load-balancing kernel for MatMultAdd_SeqSELL and fine tune the heuristic. Kernel7 is significantly slower than kernel9x for the following two cases: - nrows is too small. Kernel7 uses 2 threads per row (assuming sliceheight=16), it does not fully utilize the GPU if nrows < 100K. - maxslicewidth is too big. Thanks-to: Peng Wang <penwang@nvidia.com>