One-liners from petsc/petsc!5344 and petsc/petsc!5557
  Slightly reworked regular expression
  git ls-files -z -- '*.c' '*.cxx' '*.cu' '*.h' '*.hpp' '*.cpp' | while IFS= read -r -d '' file; do cat $file | tr '\n' '\r' | sed -E 's/\r([ ]*)(for|if|while|else) ([^\r]*)\{\r[ ]*Petsc([a-zA-Z]*)\(([^\r]*)\);\r[ ]*\}\r/\r\1\2 \3Petsc\4(\5);\r/g' | tr '\r' '\n' > ${file}.joe; mv ${file}.joe ${file}; done
Mat: Fix and improve the performance of dense matrix multiplication
  Mat: Add MATDENSEFROMVECTYPE constructor type
  Now in a tests set you can do
  ```
  testset:
    args: -mat_type densefromvectype
    test: test_cuda
      requires: cuda
      args: -vec_type veccuda
    test: test_hip
      requires: hip
      args: -vec_type vechip
  ```
  (This assumes that you call `MatSetVecType()` before you call `MatSetFromOptions()`)
  Mat_MPIDense: Cache offsets of MatDenseGetSubMatrix() to avoid communication in more cases
  Mat: Add missing implementations for internal "MatMultColumnRange()" interface
  Mat_MPIDense: Fix the zeroing of buffers in multiplication routines
  Mat_MPIDense: Add optimization of MatMatMult routines when all columns are owned by rank 0
  The communication for intermediate buffers can be handled with allreduce / bcast operations, but we use the PetscSF matvec context instead of MPI routines directly so that we will use gpu-aware MPI if possible.
Partially revert !8099 for non-blocking collectives
Remove unneeded PetscMPIIntCast() for routines using PetscCount
  Fix #1661
Brain dead fixes for useless casts
Add support to MPIU_Allreduce to prevent int overflow with a single integer argument
Add the compiler flags '-Wconversion', '-Wno-sign-conversion', '-Wno-float-conversion', '-Wno-implicit-float-conversion' to the CI
  Also fix the code in the repository to compile cleanly with these flags in the CI
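  To illustrate the kind of change this commit describes, here is a minimal sketch of an implicit narrowing conversion that `-Wconversion` reports, next to a range-checked explicit cast. The helper `checked_int_cast()` and the function `conversion_example()` are hypothetical names invented for this illustration; they are not PETSc APIs, and the actual fixes in the repository may look different.

  ```c
  #include <assert.h>
  #include <limits.h>
  #include <stdint.h>

  /* Hypothetical helper (not a PETSc API): narrow a 64-bit value to int.
     Returns 0 on success, 1 if the value does not fit in an int. */
  int checked_int_cast(int64_t value, int *out)
  {
    if (value > INT_MAX || value < INT_MIN) return 1; /* would overflow, leave *out untouched */
    *out = (int)value; /* explicit cast is safe: the range was checked above */
    return 0;
  }

  /* Hypothetical usage example */
  void conversion_example(void)
  {
    int64_t big = (int64_t)INT_MAX + 1;
    int     n   = 0;

    /* int bad = big;   implicit 64-bit to int narrowing: -Wconversion warns here */

    assert(checked_int_cast(42, &n) == 0 && n == 42);  /* fits: value is stored */
    assert(checked_int_cast(big, &n) != 0 && n == 42); /* too large: n is unchanged */
  }
  ```

  The explicit cast after a range check both silences the warning and documents that the narrowing is intentional.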
Config: get rid of PETSC_HAVE_OMPI_MAJOR_VERSION and include it in petscpkg_version.h
SF: need to sync the stream before MPI send even when there is nothing to send but there is something to receive
  There might be pending gpu operations on the receive buffer. Without synchronization, say we proceed to MPI_Waitall(). MPI might stage on host and do a H2D copy on an internal stream on the receive side. Previous gpu operations COULD happen after the H2D copy, causing a write-after-write reorder violation!
Merge remote-tracking branch 'origin/release'
Minor fixes to website material
LIBBASE is no longer used in make so remove it
Merge branch 'barry/2023-10-25/rename-rules-doc' into 'main'
  Rename rules.doc and rules.utils because GitLab treats the former as a MS Word document.
  See merge request petsc/petsc!6965
PetscSF: add MPI-4.0 persistent neighborhood collectives support
PetscSF: refactor and modularize the code to better support persistent communication
Rename rules.doc and rules.utils because GitLab treats the former as a MS Word document.
  Thanks-to: Jed Brown
Convert all header guards to pragma once
Fix PetscCallMPI(MPI_Allreduce()) with PETSc types
non-test and tutorial makefiles only need rules.doc not the full rules
  Commit-type: documentation
Only makefiles in the test and tutorial directories need lib/petsc/conf/test
  Commit-type: housekeeping
Remove now unneeded SOURCE* variables from makefiles
  Commit-type: configure, housekeeping
Remove empty preprocessor variables
Remove unneeded declarations of LOCDIR from all the makefiles
  Commit-type: documentation
Make PetscErrorCode a non-discardable enum
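  As a rough illustration of what "non-discardable" buys, the sketch below makes the compiler warn when an error code is ignored. Note the swapped-in technique: the commit says the enum itself is made non-discardable, whereas this sketch uses the GCC/Clang `warn_unused_result` attribute on a function. `DemoErrorCode`, `DEMO_NODISCARD`, `demo_set_size()` and `nodiscard_example()` are hypothetical names for this example only and do not reflect how PETSc implements it.

  ```c
  #include <assert.h>

  /* Hypothetical error-code enum standing in for a type like PetscErrorCode */
  typedef enum { DEMO_SUCCESS = 0, DEMO_ERR_ARG = 1 } DemoErrorCode;

  /* Assumption: GCC or Clang; other compilers get an empty macro */
  #if defined(__GNUC__) || defined(__clang__)
  #define DEMO_NODISCARD __attribute__((warn_unused_result))
  #else
  #define DEMO_NODISCARD
  #endif

  /* Hypothetical routine: stores n in *size, rejecting negative values */
  DEMO_NODISCARD DemoErrorCode demo_set_size(int *size, int n)
  {
    if (n < 0) return DEMO_ERR_ARG; /* invalid argument, *size is left unchanged */
    *size = n;
    return DEMO_SUCCESS;
  }

  /* Hypothetical usage example */
  void nodiscard_example(void)
  {
    int           size = 0;
    DemoErrorCode ierr;

    /* demo_set_size(&size, 3);   return value ignored: the compiler warns here */

    ierr = demo_set_size(&size, 3);
    assert(ierr == DEMO_SUCCESS && size == 3);
    ierr = demo_set_size(&size, -1);
    assert(ierr == DEMO_ERR_ARG && size == 3);
  }
  ```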
PetscSF: optimize SFALLGATHERV for the one-to-all pattern
  this happens with ML models using data parallelism having replicated local parameters
  Add test