Change *.cpp to *.cxx
This MR optimizes some MAT[SB]AIJ operations related to the diagonal

The longer-term goal is to refactor some Mat_SeqXXX non-numeric code to make it easier to also implement it on GPUs. For example, MatGetDiagonal() and friends

This commit
- MatMarkDiagonal_SeqAIJ/ELL()
  - now uses the change in mat->nonzerostate to determine if it needs to recheck the diagonal locations
  - sets the diagDense flag for a complete diagonal
- MatInvertDiagonal_SeqAIJ/ELL()
  - now uses the change in mat->state to determine if diagonal entries and inverses need to be recomputed
  - name changed to MatInvertDiagonalForSOR_SeqAIJ/ELL() for code clarity
- Added MatInvertDiagonal_SeqAIJ_Inode() for use by MatSOR_SeqAIJ_Inode()
  - now uses the change in mat->state to determine if block diagonal entries and inverses need to be recomputed
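The state-based recomputation described in this commit can be illustrated with a small, self-contained sketch. This is not the PETSc implementation: `ToyMat`, `toy_set_diag()`, and `toy_inverse_diag()` are hypothetical names, and the only idea taken from the message is that a cached inverse diagonal is rebuilt only when a state counter has changed since the cache was filled.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_MAX_N = 4 }; /* hypothetical fixed capacity, for illustration only */

/* Hypothetical miniature "matrix" that stores only its diagonal. */
typedef struct {
  size_t        n;                /* number of diagonal entries in use */
  double        diag[TOY_MAX_N];  /* diagonal values */
  double        idiag[TOY_MAX_N]; /* cached inverses of the diagonal */
  unsigned long state;            /* incremented whenever a value changes */
  unsigned long idiag_state;      /* value of state when idiag was last computed */
  int           idiag_valid;      /* 0 until idiag has been computed once */
  int           recompute_count;  /* instrumentation: how often idiag was rebuilt */
} ToyMat;

/* Change one diagonal value and bump the state counter. */
static void toy_set_diag(ToyMat *m, size_t i, double v)
{
  assert(i < m->n);
  m->diag[i] = v;
  m->state++;
}

/* Return the cached inverse diagonal, rebuilding it only if the state changed. */
static const double *toy_inverse_diag(ToyMat *m)
{
  if (!m->idiag_valid || m->idiag_state != m->state) {
    for (size_t i = 0; i < m->n; i++) m->idiag[i] = 1.0 / m->diag[i];
    m->idiag_state = m->state;
    m->idiag_valid = 1;
    m->recompute_count++;
  }
  return m->idiag;
}
```

Calling `toy_inverse_diag()` twice in a row recomputes once; a call after `toy_set_diag()` recomputes again. A second counter that changes only when the nonzero pattern changes could guard the diagonal locations in the same way.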
Remove unnecessary braces around one-liners

git grep -lE "[ ]*(if|for|while) \(.*\) {[^;]*;[^;]*}$" -- '*.c' '*.cxx' '*.cu' '*.h' '*.hpp' '*.cpp' | xargs sed -i '' -E 's#([ ]*)(if|for|while) \((.*)\) {([^;]*);([^;]*)}$#\1\2 \(\3\)\4;\5#'
One-liners from petsc/petsc!5344 and petsc/petsc!5557

Slightly reworked regular expression

git ls-files -z -- '*.c' '*.cxx' '*.cu' '*.h' '*.hpp' '*.cpp' | while IFS= read -r -d '' file; do cat $file | tr '\n' '\r' | sed -E 's/\r([ ]*)(for|if|while|else) ([^\r]*)\{\r[ ]*Petsc([a-zA-Z]*)\(([^\r]*)\);\r[ ]*\}\r/\r\1\2 \3Petsc\4(\5);\r/g' | tr '\r' '\n' > ${file}.joe; mv ${file}.joe ${file}
done
Fortran 90: fully embrace After 34 years!

- deprecate use of 'F90' in Fortran function names
- use Fortran pointers when appropriate
- the new Fortran API is not backward compatible with previous versions!
- also clean up inconsistent PETSc code detected by new Fortran generation tools
- drop use of bfort
- automatically generate all the Fortran PETSc objects, enums etc from the include files
- generate most of the Fortran interface definitions and functions from the source code
- simplify the number and organization of Fortran modules

Co-authored-by: Jose E. Roman <jroman@dsic.upv.es>
PetscCeilInt: inline method
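The definition introduced by this commit is not shown here, so the following is only a sketch of how a ceiling integer division can be written as an inline function. The name `ceil_div_int()` and its signature are hypothetical, and the restriction to a non-negative numerator and positive denominator is an assumption of the sketch.

```c
#include <assert.h>

/* Hypothetical helper: smallest integer q with q * b >= a,
   assuming a >= 0 and b > 0 (assumption of this sketch). */
static inline int ceil_div_int(int a, int b)
{
  assert(a >= 0 && b > 0);
  /* a / b truncates toward zero; add one when there is a remainder. */
  return a / b + (a % b != 0);
}
```

Writing the remainder test explicitly avoids the overflow that the common `(a + b - 1) / b` form can hit when `a` is close to `INT_MAX`.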
Mat: fix SELL HIP warnings on pragma unroll with rocm-5.4.0

/path/to/sellhip.hip.cpp:309:3: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
  for (int offset = WARP_SIZE / 2; offset >= sliceheight; offset /= 2) { t += __shfl_down(t, offset); }
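To make the loop in the warning easier to follow, here is a plain C emulation of the shuffle-down reduction on the host. It is a sketch under assumptions, not the kernel code: `TOY_WARP_SIZE` and `toy_slice_reduce()` are hypothetical, the warp is shrunk to 8 lanes, and a lane whose source lane falls outside the warp is assumed to read back its own value. The lower loop bound depends on the runtime value `sliceheight`, which is the likely reason the compiler cannot honor a full unroll request.

```c
#include <assert.h>

#define TOY_WARP_SIZE 8 /* hypothetical small warp, for illustration only */

/* Emulate: for (offset = WARP_SIZE / 2; offset >= sliceheight; offset /= 2)
              t += shuffle_down(t, offset);
   where t[lane] holds the per-lane value. Afterwards, lane r (r < sliceheight)
   holds the sum of all lanes congruent to r modulo sliceheight, assuming
   sliceheight is a power of two. */
static void toy_slice_reduce(double t[TOY_WARP_SIZE], int sliceheight)
{
  assert(sliceheight >= 1 && sliceheight <= TOY_WARP_SIZE);
  for (int offset = TOY_WARP_SIZE / 2; offset >= sliceheight; offset /= 2) {
    double shifted[TOY_WARP_SIZE];
    /* All lanes read before any lane writes, as in a real warp shuffle. */
    for (int lane = 0; lane < TOY_WARP_SIZE; lane++) {
      int src = lane + offset;
      shifted[lane] = (src < TOY_WARP_SIZE) ? t[src] : t[lane];
    }
    for (int lane = 0; lane < TOY_WARP_SIZE; lane++) t[lane] += shifted[lane];
  }
}
```

With lane values 1 through 8 and `sliceheight = 2`, lane 0 ends with 1 + 3 + 5 + 7 and lane 1 with 2 + 4 + 6 + 8; the upper lanes hold values that are not meaningful.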
Add SELLHIP

- The HIP kernels are converted directly from their CUDA version
- AMD GPUs and NVIDIA GPUs use different warp sizes. We set the warp size to 64 by default for AMD GPUs to facilitate compile-time code optimization
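A rough sketch of why a compile-time warp size helps: when the width is a preprocessor constant, loop trip counts that depend only on it are known to the compiler. The names `SKETCH_TARGET_AMD`, `SKETCH_WARP_SIZE`, and `sketch_reduction_steps()` are hypothetical and do not come from the PETSc sources. The value 64 follows the default named in the message, and 32 is the NVIDIA warp size.

```c
#include <assert.h>

/* SKETCH_TARGET_AMD is a hypothetical build flag used only in this sketch. */
#if defined(SKETCH_TARGET_AMD)
#define SKETCH_WARP_SIZE 64
#else
#define SKETCH_WARP_SIZE 32
#endif

/* Number of halving steps in a full-warp tree reduction, i.e. log2 of the
   warp size. Both loop bounds are compile-time constants, so the compiler
   can unroll the loop or fold it away entirely. */
static inline int sketch_reduction_steps(void)
{
  int steps = 0;
  /* The halving scheme assumes a power-of-two warp size. */
  assert((SKETCH_WARP_SIZE & (SKETCH_WARP_SIZE - 1)) == 0);
  for (int offset = SKETCH_WARP_SIZE / 2; offset >= 1; offset /= 2) steps++;
  return steps;
}
```

Built without the hypothetical flag, the function returns 5; built with it, 6.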