Merge remote-tracking branch 'origin/release'
MAT/SF: add some comments to code
Doc: fix MATMPIAIJCUSPARSE manual
    Reported-by: Victor Eijkhout <eijkhout@tacc.utexas.edu>
Mat: remove the unneeded +1 in memory allocation for garray
MatSetValuesDevice: Cleanup and simplify code, including example
    User reported crash of example code. Kernel was passed an ierr that lived in CPU memory
    MatSetValuesDevice: do not include private headers from public headers
    Feature: MatSetValuesDevice determines automatically from the context (where it is included from) if it is being used from C, CUDA, or Kokkos, PETSC_DEVICE_FUNC_DEC no longer needs to be set before including petscaijdevice.h
    Feature: MatSetValuesDevice() now ignores all values outside the global column range.
    PetscSplitCSRDataStructure is now a pointer, not a struct, like most PETSc objects, please leave it that way.
    Fix all uses of CTABLE that were related to the original MatSetValuesDevice()
    Have atomicAdd use Kokkos atomic-add with CPU build when building with Kokkos.
    Cuda should now work with --download-openmpi, this is done by updating updateCompilers() to rerun portions of packages/cuda.py after the compilers are reset to use MPI wrappers. This is needed because the resetting of the compilers removes all the compiler flags and packages/cuda.py sets certain values into these flags that was previously lost.
    Add MPICXX_INCLUDES, MPICXX_LIBS to fix compile targets for Kokkos examples
    'make check' now runs properly for Kokkos test of src/snes/ex3k, fixed bug in the makefile wrt MPI_IS_MPIUNI check
    Testing makefile rules: add ex*cu binaries to clean rule
    Reported-by: Sam Fagbemi <samkorede24@gmail.com>
    Thanks-to: Stefano Zampini <stefano.zampini@gmail.com>
    Thanks-to: Mark Adams <mfadams@lbl.gov>
    /spend 16h
Remove use of PETSC_SKIP_CXX_COMPLEX_FIX in petsc source code
Remove unneeded WaitForCUDA()
Rename -mat_cusparse_transgen to -mat_form_explicit_transpose
Add SF NVSHMEM support
Add non-null PetscDefault{Cuda,Hip}Stream
CHKERRQ() -> CHKERRMPI()
MatCreate: When PETSc is configured with device, set boundtocpu to true at creation time
Mat: Move COO events out of CUSPARSE class
    Fix a few typos in the code
MatMPIAIJ: move generic code for MatMat product to base class
MatMPIAIJSetPreallocation_ : minor fixes for cusparse and kokkos
MatAIJCUSPARSESetGenerateTranspose: convenience function for seq and mpi
MATMPIAIJCUSPARSE: add support for sparse MatMat operations
MatSetValuesCOO: use cuda memory if possible
MatMPIAIJGetLocalMatMerge: support for merging two matrices
MatSetValuesCOO: perform addition of repeated entries when INSERT_VALUES is specified
    use temporary data instead of persistent storage in cusparse implementation
Convert MPI error type to PETSc error with string message for all MPI calls
    Now PETSc examples will ONLY return PETSc error codes and never MPI error codes directly so we can understand and post-process their errors better.
    The test harness will now automatically retry tests that fail with MPI, this may help with Intel MPI that produces seemingly random failures.
    Commit-type: error-checking
    /spend 30m
checkbadSource: apply rules to *.cu *.cpp sources, and expand CHKERRQ check to CHKERR(Q|MPI|CUDA|CUBLAS|CUSPARSE)
Adding Cuda and Kokkos assembly. Added Device assembly to Landau operator. Added Kokkos test mat/ex5k.
MATCUSPARSE: Implement fast assembly from COO data