| 647a6db4 | 22-Feb-2019 |
Junchao Zhang <jczhang@mcs.anl.gov> |
Always use const data as key to look up links in SF
With this modification, a VecScatter implementation can access only the read-only vector x in VecScatterBegin(..,x,y,..) and leave y, the vector to be written, untouched. Access to y is deferred to VecScatterEnd(..,x,y,..).
The benefit is that we can get rid of VecGet/RestoreArray(y,..) in VecScatterBegin(). For vectors that are on GPU or are not PETSc-native (e.g., VecNest), getting and restoring arrays is quite expensive.
In other words, data access is lazy: data is touched only when it is needed. This discussion also applies to PetscSFXxxBegin/End.
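The idea can be illustrated with a minimal, PETSc-free sketch in C. Everything in it is hypothetical (Link, links, scatter_begin, scatter_end, the fixed table sizes); it is not the PetscSF implementation. It only shows the pattern described above: the begin call reads the const source and records a link keyed by the source address, and the end call finds that link by the same key and only then writes the destination.

```c
#include <stddef.h>

#define MAX_LINKS 8   /* illustrative limit on in-flight operations */
#define MAX_LEN   16  /* illustrative limit on packed entries */

/* One in-flight operation, identified by the address of the read-only source. */
typedef struct {
  const void *key;          /* address of the const source array */
  double      buf[MAX_LEN]; /* packed copy of the selected source entries */
  size_t      n;            /* number of packed entries */
  int         in_use;       /* nonzero while the operation is pending */
} Link;

static Link links[MAX_LINKS];

/* Begin: reads only x and packs x[idx[j]]; the destination is not touched.
   Returns 0 on success, -1 if n is too large or no link slot is free. */
int scatter_begin(const double *x, const size_t *idx, size_t n)
{
  if (n > MAX_LEN) return -1;
  for (int i = 0; i < MAX_LINKS; i++) {
    if (!links[i].in_use) {
      links[i].key    = x;
      links[i].n      = n;
      links[i].in_use = 1;
      for (size_t j = 0; j < n; j++) links[i].buf[j] = x[idx[j]];
      return 0;
    }
  }
  return -1;
}

/* End: looks up the link by the same const key and only now writes y.
   Returns 0 on success, -1 if no pending operation matches x. */
int scatter_end(const double *x, double *y)
{
  for (int i = 0; i < MAX_LINKS; i++) {
    if (links[i].in_use && links[i].key == x) {
      for (size_t j = 0; j < links[i].n; j++) y[j] = links[i].buf[j];
      links[i].in_use = 0;
      return 0;
    }
  }
  return -1;
}
```

Because the lookup key is the source address alone, the begin call in this sketch does not need the destination at all, which is what lets access to y be postponed until the end call.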
|
| c943f53f | 13-Apr-2016 |
Jed Brown <jed@jedbrown.org> |
SFBasic: circumvent send-to-self by marking self as a distinguished group
The distinguished group is currently only self, but will later be the group of a shared memory communicator. The leaf buffers for distinguished ranks are pointers directly into root buffers; MPI is not needed. When distinguished ranks are on different processes, we will need a weaker synchronization mechanism to ensure memory ordering.
|
| bf39f1bf | 13-Aug-2017 |
Jed Brown <jed@jedbrown.org> |
SFBasic: pack into non-contiguous (by rank) root/leaf buffers instead of contiguous buffer
This is in preparation for self and shared memory communication, for which packed leaf and root data may use a common buffer. In that case, a single contiguous buffer for root and another for leaf data is not workable.
|
| 21c688dc | 12-Apr-2016 |
Jed Brown <jed@jedbrown.org> |
SF: delay construction of rank mapping until SF type is set
To implement a shared memory optimization for PetscSF without ugly hacks that scale poorly, we need to be aware of the shared ranks when constructing the rank mapping. With eager construction of the rank mapping, this means either discarding the eagerly built mapping or doing a fairly complicated and memory-intensive conversion. This commit delays that construction, but has no further behavioral change.
ex1: explicitly call PetscSFSetUp so that PetscSFView shows the full communication graph.
|