| #
e8c0849a
|
| 20-Nov-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'barry/2025-10-18/optimize-aij-ops' into 'main'
Refactor handling of diagonal marking in *AIJ and SELL matrices
See merge request petsc/petsc!8797
|
| #
421480d9
|
| 19-Oct-2025 |
Barry Smith <bsmith@mcs.anl.gov> |
- Replace MatMissingDiagonal() and MatMarkDiagonals_SeqXXX() with MatGetDiagonalMarkers_SeqXXX()
- Mat_SeqXXX->diag is no longer created automatically during MatAssemblyEnd(), saving memory and time
- Accessing Mat_SeqXXX->diag now requires the use of MatGetDiagonalMarkers_SeqXXX() except when the current values are known to be correct; for example during numerical factorizations and solves
- Mat_SeqXXX->diag is now never shared among matrices; hence the free_diag flag is gone. Sharing was always risky, since any of the owning matrices could change the values, making them incorrect for the other owners.
|
| #
b31b2f82
|
| 10-Nov-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'barry/2025-10-24/fix-dmshellsetdestroyctx' into 'main'
Finish converting the function prototypes of destroy for contexts to use PetscCtxDestroyFn
See merge request petsc/petsc!8810
|
| #
cc1eb50d
|
| 27-Oct-2025 |
Barry Smith <bsmith@mcs.anl.gov> |
Change names of Mat_XXX product contexts to MatProductCtx_XXX for code maintainability
Update destroy callback of all MatProductCtx and MatShellSetMatProductOperation() to use PetscCtxDestroyFn
|
| #
76f14e82
|
| 11-Aug-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge remote-tracking branch 'origin/release'
|
| #
82ff078f
|
| 05-Aug-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jczhang/2025-08-01/fix-MatCreateMPIAIJWithSeqAIJ' into 'release'
Change the prototype of a ctor of Mat_SeqAIJKokkos
See merge request petsc/petsc!8626
|
| #
ecd797f4
|
| 02-Aug-2025 |
Junchao Zhang <jczhang@anl.gov> |
Mat: change the prototype of a ctor of Mat_SeqAIJKokkos
This correctly marks the a, i, j arrays on the host as allocated by Mat_SeqAIJKokkos, so that they are not used after Mat_SeqAIJKokkos is deleted.
close #1798
|
| #
934c28dd
|
| 22-Jul-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge remote-tracking branch 'origin/release'
|
| #
09117800
|
| 22-Jul-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'zach/fixes-gpu-mi300a' into 'release'
MATHYPRE and Kokkos Fixes
See merge request petsc/petsc!8510
|
| #
f73bfc44
|
| 15-Jul-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge remote-tracking branch 'origin/release'
|
| #
1d70d49e
|
| 11-Jul-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jczhang/2025-07-01/fix-MatCreateMPIAIJWithSeqAIJ-device' into 'release'
Fix bugs related to MatCreateMPIAIJWithSeqAIJ
See merge request petsc/petsc!8524
|
| #
d1c799ff
|
| 07-Jul-2025 |
Junchao Zhang <jczhang@anl.gov> |
aijkokkos: fix a bug in MatAssemblyEnd_SeqAIJKokkos: a, i, j on the host must be preserved before deleting aijkok
|
| #
f3d3cd90
|
| 09-Jul-2025 |
Zach Atkins <Zach.Atkins@colorado.edu> |
Split KokkosDualViewSync into Host and Device versions
|
| #
afb41d4c
|
| 28-Mar-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jczhang/2025-03-18/revise-aijkokkos-matsolve' into 'main'
Add options to do factorization and solve on host for matseqaijkokkos
See merge request petsc/petsc!8209
|
| #
aac854ed
|
| 27-Mar-2025 |
Junchao Zhang <jczhang@anl.gov> |
aijkokkos: support factorization on host but solve on device
|
| #
23386071
|
| 20-Mar-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'stevendargaville/mat-kokkoscreatedevice' into 'main'
Added new API calls to allow kokkos matrices to be built with no host preallocation
See merge request petsc/petsc!8206
|
| #
c0c276a7
|
| 20-Mar-2025 |
sdargavi <s.dargaville@imperial.ac.uk> |
Changes to allow building gpu matrices on the device.
Changed MatCreateMPIAIJWithSeqAIJ() so global sizes must be given to prevent reduction.
Changed MatCreateMPIAIJWithSeqAIJ() so B can be passed in with local indices, hence no compactification occurs.
Added MatCreateSeqAIJKokkosWithKokkosViews() which creates a Mat with no preallocation on the host.
Added test that checks building kokkos matrices without any host preallocation
|
| #
97fff7b2
|
| 07-Mar-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jczhang/2025-03-05/revise-PetscGetKokkosExecutionSpace' into 'main'
Return execution space instead of reference to simplify the code
See merge request petsc/petsc!8182
|
| #
4df4a32c
|
| 07-Mar-2025 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Kokkos: return execution space instead of reference to simplify the code
A Kokkos execution space is a shared pointer. We don't need to reference it
|
| #
b7b2c57c
|
| 05-Feb-2025 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jczhang/2025-01-30/feature-support-AMD-MI300A' into 'main'
Add support of AMD MI300A
Closes #1703
See merge request petsc/petsc!8110
|
| #
45402d8a
|
| 30-Jan-2025 |
Junchao Zhang <jczhang@anl.gov> |
Kokkos: add support of AMD MI300A
* Use HostMirrorMemorySpace instead of HostSpace to fix compile errors on MI300A
* Replace Kokkos::HostSpace with HostMirrorMemorySpace to fix compile errors on MI300A, since the latter is what Kokkos::DualView uses for its host view
* Fix a subtle bug in KokkosDualViewSync() w.r.t. MI300A. Suppose we want to sync a PETSc VecKokkos v on the host. On MI300A, the host copy v_h and the device copy v_d share the same memory, so the old code used if (v_dual.need_sync_host()) to skip the device-to-host memory copy. However, exec.fence() must not be skipped: the device might still have kernels writing to v_d, so we still need to sync the device/stream to make v_d ready for use on the CPU (via v_h).
|
| #
9aa5e16c
|
| 21-Jun-2024 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'jczhang/2024-06-03/misc-gpu-stream-improve' into 'main'
Misc. GPU stream improvement
See merge request petsc/petsc!7614
|
| #
92896123
|
| 05-Jun-2024 |
Junchao Zhang <jczhang@anl.gov> |
Kokkos: try to always have an execution space argument
|
| #
314ab5fd
|
| 22-Dec-2023 |
Satish Balay <balay@mcs.anl.gov> |
Merge branch 'barry/2023-06-07/optimize-multivecs-zhang' into 'main'
Optimize VecMDot_Seq as suggested by Junchao Zhang using BLAS 2 gemv
See merge request petsc/petsc!6580
|
| #
e907feaa
|
| 19-Dec-2023 |
Junchao Zhang <jczhang@anl.gov> |
Vec: add GEMV optimizations for VecMDot and friends for VecKokkos
|