| 45402d8a | 30-Jan-2025 |
Junchao Zhang <jczhang@anl.gov> |
Kokkos: add support of AMD MI300A
* Use HostMirrorMemorySpace instead of HostSpace to fix compile errors on MI300A
* Replace Kokkos::HostSpace with HostMirrorMemorySpace to fix compile errors on MI300A, since the latter is what Kokkos::DualView uses for its host view
* Fix a subtle bug in KokkosDualViewSync() w.r.t. MI300A. Suppose we want to sync a PETSc VecKokkos v on the host. On MI300A, the host copy v_h and the device copy v_d share the same memory, so the old code used if (v_dual.need_sync_host()) to skip the device-to-host memory copy. But it should not also skip the exec.fence(): the device might still have kernels writing v_d, so we still need to sync the device/stream to make v_d ready for use on the CPU (via v_h).
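
The fence issue can be illustrated with a small standard-library toy. This is not Kokkos or PETSc code. ToyStream, ToySharedVec and SyncHost are hypothetical names made up for this sketch. It assumes that a "stream" is just a queue of pending work that is only guaranteed complete after Fence(), and that the host and device views alias one allocation, as described for MI300A.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device stream: launched work is queued and is
// only guaranteed to have run once Fence() returns.
struct ToyStream {
  std::vector<std::function<void()>> pending;

  void Launch(std::function<void()> kernel) { pending.push_back(std::move(kernel)); }

  void Fence() {
    for (auto &kernel : pending) kernel(); // drain all outstanding work
    pending.clear();
  }
};

// Hypothetical vector whose host and device views alias one allocation
// (the shared-memory situation), so no device-to-host copy is ever needed.
struct ToySharedVec {
  std::vector<double> data;
  bool need_sync_host = false; // stays false when memory is shared
};

// Make the data safe to read on the host. The copy is conditional,
// the fence is not.
const double *SyncHost(ToySharedVec &v, ToyStream &stream) {
  if (v.need_sync_host) {
    // On a discrete-memory system a device-to-host copy would be issued here.
    v.need_sync_host = false;
  }
  stream.Fence(); // always wait for in-flight kernels, even if no copy was needed
  return v.data.data();
}
```

In this toy, placing stream.Fence() inside the if block would reproduce the bug: with shared memory the condition is false, the fence is skipped, and the host reads data that a queued kernel has not finished writing.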
|