| fa046f9f | 12-Sep-2021 |
Junchao Zhang <jczhang@mcs.anl.gov> |
MatProduct: need to check symmetry between symbolic and numeric
Suppose we have the code: MatTransposeMatMult(A,B,MAT_INITIAL_MATRIX,fill,&C); MatTransposeMatMult(E,B,MAT_REUSE_MATRIX,fill,&C);
If A is symmetric and E is not, but E has the same nonzero pattern as A, then the above code is legitimate. If the MAT_INITIAL_MATRIX call changes the MatProductType from MATPRODUCT_AtB to MATPRODUCT_AB (valid there because A is symmetric), then the MAT_REUSE_MATRIX call must redo the symbolic phase; otherwise C=E^T*B would be wrongly computed as C=E*B.
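The check described above can be illustrated with a small standalone sketch. It does not use PETSc: the names ProductKind, ProductPlan, plan_symbolic, multiply, and transpose_mat_mult_reuse are hypothetical, and fixed-size dense 2x2 arrays stand in for Mat objects. It only shows the idea of recording the symmetry assumption made in the symbolic phase and redoing that phase when the numeric phase sees an operand that breaks it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N 2 /* tiny fixed size keeps the sketch short */

/* Hypothetical stand-ins for the two product types discussed above. */
typedef enum { PRODUCT_AB, PRODUCT_AtB } ProductKind;

/* Hypothetical record of what a symbolic phase decided. */
typedef struct {
  ProductKind planned;           /* product the numeric phase will run */
  bool        assumed_symmetric; /* symmetry seen when the plan was made */
} ProductPlan;

bool is_symmetric(double A[N][N])
{
  for (size_t i = 0; i < N; i++)
    for (size_t j = i + 1; j < N; j++)
      if (A[i][j] != A[j][i]) return false;
  return true;
}

/* "Symbolic" phase for a requested A^T*B: use A*B when A is symmetric. */
void plan_symbolic(ProductPlan *plan, double A[N][N])
{
  assert(plan != NULL);
  plan->assumed_symmetric = is_symmetric(A);
  plan->planned = plan->assumed_symmetric ? PRODUCT_AB : PRODUCT_AtB;
}

/* "Numeric" kernel: C = A*B or C = A^T*B, depending on kind. */
void multiply(ProductKind kind, double A[N][N], double B[N][N], double C[N][N])
{
  for (size_t i = 0; i < N; i++)
    for (size_t j = 0; j < N; j++) {
      double sum = 0.0;
      for (size_t k = 0; k < N; k++)
        sum += (kind == PRODUCT_AtB ? A[k][i] : A[i][k]) * B[k][j];
      C[i][j] = sum;
    }
}

/* Reuse path: compare the symmetry of the new operand with the one the
   plan assumed, and redo the symbolic phase on a mismatch.
   Returns true when the plan had to be rebuilt. */
bool transpose_mat_mult_reuse(ProductPlan *plan, double E[N][N],
                              double B[N][N], double C[N][N])
{
  bool replanned = false;
  assert(plan != NULL);
  if (plan->assumed_symmetric != is_symmetric(E)) {
    plan_symbolic(plan, E);
    replanned = true;
  }
  multiply(plan->planned, E, B, C);
  return replanned;
}
```

In this sketch, planning with a symmetric A selects PRODUCT_AB; a later reuse call with a non-symmetric E detects the mismatch, switches the plan back to PRODUCT_AtB, and only then runs the numeric kernel.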
|
| 042217e8 | 10-Jun-2021 |
Barry Smith <bsmith@mcs.anl.gov> |
MatSetValuesDevice: Cleanup and simplify code, including example
A user reported a crash of the example code: the kernel was passed an ierr that lived in CPU memory.
MatSetValuesDevice: do not include private headers from public headers
Feature: MatSetValuesDevice determines automatically from the context (where it is included from) whether it is being used from C, CUDA, or Kokkos; PETSC_DEVICE_FUNC_DEC no longer needs to be set before including petscaijdevice.h
Feature: MatSetValuesDevice() now ignores all values outside the global column range.
PetscSplitCSRDataStructure is now a pointer, not a struct, like most PETSc objects; please leave it that way.
Fix all uses of CTABLE that were related to the original MatSetValuesDevice()
Have atomicAdd use Kokkos atomic-add with CPU build when building with Kokkos.
CUDA should now work with --download-openmpi. This is done by updating updateCompilers() to rerun portions of packages/cuda.py after the compilers are reset to use MPI wrappers. This is needed because resetting the compilers removes all the compiler flags, and packages/cuda.py sets certain values in these flags that were previously lost.
Add MPICXX_INCLUDES, MPICXX_LIBS to fix compile targets for Kokkos examples
'make check' now runs properly for the Kokkos test of src/snes/ex3k; fixed a bug in the makefile wrt the MPI_IS_MPIUNI check
Testing makefile rules: add ex*cu binaries to clean rule
Reported-by: Sam Fagbemi <samkorede24@gmail.com> Thanks-to: Stefano Zampini <stefano.zampini@gmail.com> Thanks-to: Mark Adams <mfadams@lbl.gov>
/spend 16h
|