Fortran 90: fully embrace After 34 years!
- deprecate use of 'F90' in Fortran function names
- use Fortran pointers when appropriate
- the new Fortran API is not backward compatible with previous versions!
- also clean up inconsistent PETSc code detected by new Fortran generation tools
- drop use of bfort
- automatically generate all the Fortran PETSc objects, enums etc from the include files
- generate most of the Fortran interface definitions and functions from the source code
- simplify the number and organization of Fortran modules
Co-authored-by: Jose E. Roman <jroman@dsic.upv.es>
Reuse MPISELL operations for SELLCUDA and SELLHIP
LIBBASE is no longer used in make so remove it
Ensure no leading white spaces in front of .seealso:
Rename rules.doc and rules.utils because GitLab treats the former as a MS Word document.
Thanks-to: Jed Brown
Convert all header guards to pragma once
Fix some malformed if !defined() header guards
Add static to internal functions
Lint apply: mat
docs: additional chapter_ -> ch_ change in main after merge of release changes from !6520
Finish MPICUDASELL
Initial commit for porting SELL to GPU
- Add tiled SpMV and basic SpMV for SeqSELL
- Tested in serial
- Offloadmask is used to determine when the matrix should be copied to GPU
- Use different slice height for CUDA version
- By checking the nonzerostate, PETSc can decide whether the whole matrix needs to be copied or just the values
- Make the convert function public so that the very slow MatConvert_Basic can sometimes be avoided. E.g. one can use a two-step convert method: AIJ->SELL, SELL->SELLCUDA instead of the direct convert AIJ->SELLCUDA
- Make the FLOPS count for SELL the same as that for AIJCUSPARSE.
- MatDisAssemble is not needed.
- Change slice height from 32 to 16 for GPU
- To overlap communication with MatMult, VecScatterBegin() should be called before MatMult() for the diagonal part.
- SLICE_HEIGHT is defined to be 32 to match the warp size of GPU. For other cases, it is still 8.
Funded-by:
Project: PETSc for GPU
Time: 42 hours
Reported-by:
Thanks-to:
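
For readers unfamiliar with the slice height mentioned above, here is a minimal CPU sketch of a sliced-ELLPACK (SELL) matrix-vector product. It is not PETSc's implementation: the `SellSketch` struct, its field names, and `sell_spmv` are hypothetical and used only for illustration. It assumes rows are grouped into slices of `slice_height` rows, each slice is padded to its longest row, entries are stored column-major inside a slice, and padding entries hold a zero value with a valid column index.

```c
/* Hypothetical minimal SELL container; names are illustrative, not PETSc's. */
typedef struct {
  int           nrows;        /* number of matrix rows */
  int           slice_height; /* rows per slice (the commit uses 8 on CPU) */
  const int    *sliidx;       /* offset of each slice in val/colidx; nslices + 1 entries */
  const int    *colidx;       /* column indices, column-major inside each slice */
  const double *val;          /* values in the same layout; padding entries are 0.0 */
} SellSketch;

/* Compute y = A * x for a matrix stored in the layout described above. */
static void sell_spmv(const SellSketch *A, const double *x, double *y)
{
  int nslices = (A->nrows + A->slice_height - 1) / A->slice_height;

  for (int s = 0; s < nslices; s++) {
    int start = A->sliidx[s];
    /* Every slice stores slice_height rows, so its padded width follows from its size. */
    int width = (A->sliidx[s + 1] - start) / A->slice_height;

    for (int r = 0; r < A->slice_height; r++) {
      int row = s * A->slice_height + r;
      if (row >= A->nrows) break; /* padding rows of the last slice */

      double sum = 0.0;
      for (int j = 0; j < width; j++) {
        /* Column-major in the slice: the rows of one column position are adjacent. */
        int k = start + j * A->slice_height + r;
        sum += A->val[k] * x[A->colidx[k]];
      }
      y[row] = sum;
    }
  }
}
```

In this layout, adjacent rows of a slice read adjacent memory locations at each column position, which is presumably why the commit ties the slice height to the GPU warp size: one thread per row then gives coalesced loads.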