Lines Matching refs:memory
9 Most algorithms in PETSc are memory
10 …a simulation depends more on the total achievable[^achievable-footnote] memory bandwidth of the c…
12 … for gaining insights into parallel performance (scaling) by measuring achievable memory bandwidth.
22 …REAMS measures the total memory bandwidth achievable when running `n` independent threads or proce…
23 `N` on a shared memory node.
26 Though real simulations have more complex memory access patterns, most computations for PDEs have l…
27 independent non-overlapping memory STREAMS model still provides useful information.
30 …ficiency) obtained on a given system indicates the likely performance of memory bandwidth-limited …
32 Fig. {any}`fig_gcc_streams` plots the total memory bandwidth achieved and the speedup for runs on a…
43 There are three important concepts needed to understand memory bandwidth-limited computing.
45 - Thread or process **binding** to hardware subsets of the shared memory node. The Unix operating s…
49 …t) to hardware subsets when more threads or processes are used. Physical memory is divided into mu…
50 …independently provide a certain memory bandwidth. Different cores may be more closely connected to…
51 …orm memory access (**NUMA**), meaning the memory latency or bandwidth for any particular core depe…
53 or process to use a different memory unit
55 …ead or process to cores that do not share the previously assigned core's memory unit ensures a hig…
56 … that each thread or process **uses data on the closest memory unit**. The OS selects the memory u…
57 of virtual memory based on **first touch**:
58 …e first thread or process to touch (read or write to) a memory address determines to which memory …
292 …e second communication mechanism is Unix shared memory `shmget()`. Here, `PCMPI` allocates shared …
321 … predict the speedup (parallel efficiency) of a memory bandwidth-limited application** on a shared…
323 For the Apple M2, we present the results using Unix shared-memory communication of the matrix and v…
325 To run this, one must first set up the machine to use shared memory as described in `PetscShmgetAllo…
335 memory bandwidth. However, one should not expect the speedup to be near the total number of cores o…
340 [^achievable-footnote]: Achievable memory bandwidth is the actual bandwidth one can obtain
343 [^memorymigration-footnote]: Data can also be migrated among different memory sockets during a comp…