      subroutine commuInt (global, ilwork, n, code)
c---------------------------------------------------------------------
c
c This subroutine is responsible for interprocessor communication of
c the residual and solution vectors.
c
c input:
c     global(nshg,n): global vector to be communicated. Note that
c                     this vector is local to the processor, (i.e.
c                     not distributed across processors)
c     ilwork(nlwork): this is the local interprocessor work array.
c                     This array is local to the processor, (i.e.
c                     each processor has a unique ilwork array).
c     n:              second dimension of the array to be communicated
c     code:           = 'in'  for communicating the residual
c                     = 'out' for communicating the solution
c
c---------------------------------------------------------------------
c
c The array ilwork describes the details of the communications.
c Each communication step (call of this routine) consists of a
c sequence of "tasks", where a task is defined as a communication
c between two processors in which data is exchanged. Consequently,
c a given processor has as many tasks as there are processors with
c which it must communicate. Details of the ilwork array appear
c below.
c
c---------------------------------------------------------------------
c
      include "common.h"
      include "mpif.h"
      include "auxmpi.h"

      integer status(MPI_STATUS_SIZE)
      integer stat(MPI_STATUS_SIZE, 2*maxtask), req(2*maxtask)

      integer*8 global(nshg,n),
     &          rtemp(maxfront*n,maxtask)
      integer   ilwork(nlwork)

      character*3 code

      if (code .ne. 'in ' .and. code .ne. 'out')
     &  call error ('commuInt','code ',0)

      if (n .eq. 1) then          ! like a scalar
        kdof = 16
      else
        call error ('commuInt','n ',n)
      endif
c      return
c... Note that when adding another kdof to the above set, we must
c... also make changes in ctypes.f and auxmpi.h

c---------------------------------------------------------------------
c ilwork(1): number of tasks
c
c The following information is contained in ilwork for each task:
c
c    itag:   tag of the communication
c    iacc:   == 0 if the task is a send
c            == 1 if the task is a receive
c    iother: rank of the processor with which this communication
c            occurs
c    numseg: number of data "segments" to be sent or received. A
c            segment is defined as a contiguous section of the global
c            vector to be communicated, (i.e. a group of nodes (or,
c            rather, "shape function coefficients") which occur
c            sequentially in the array global(nshg,n)).
c    isbeg:  location of the first segment in the array owned by the
c            current processor.
c
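c For illustration only (hypothetical values, not drawn from any
c actual mesh): a processor with a single receive task, tagged 3,
c paired with rank 2, covering two segments that start at entries
c 10 (length 3) and 20 (length 2) of the global vector, would carry
c
c    ilwork = ( 1,              number of tasks
c               3, 1, 2, 2,     itag, iacc, iother, numseg
c               10, 3,          start/length of segment 1
c               20, 2 )         start/length of segment 2
c
c i.e. each task occupies 4 + 2*numseg entries, which is exactly the
c stride used to advance itkbeg in the loops below.
c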
c The two types of communication are 'in', where the residual is
c being communicated, and 'out', where the solution is being
c communicated. Note that when the type is 'out', senders receive
c and receivers send.
c
c The following comments pertain to a communication of type 'in':
c
c     If the task is a send, then all of the numseg segments are
c     sent with a single call to MPI_ISEND. Where these segments
c     live in the array is encoded in sevsegtype, a common array of
c     derived datatypes constructed in the subroutine "ctypes.f".
c     In other words, sevsegtype describes the blocks to be sent in
c     terms of their beginning indices and the length of each
c     segment. Using this, a single send takes care of all the
c     segments for this task.
c
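c     As a rough sketch of what such a datatype amounts to (the
c     actual construction lives in ctypes.f and may differ; the
c     variable names below and the MPI_INTEGER8 base type, chosen
c     to match the integer*8 declaration here, are illustrative
c     only), an indexed type covering the two hypothetical segments
c     above could be built as
c
c        integer ilen(2), idisp(2), iseg2type, ierr
c        ilen(1)  = 3          ! length of segment 1
c        ilen(2)  = 2          ! length of segment 2
c        idisp(1) = 0          ! segment 1 starts at the buffer root
c        idisp(2) = 20 - 10    ! offset of segment 2 from segment 1
c        call MPI_TYPE_INDEXED (2, ilen, idisp, MPI_INTEGER8,
c     &                         iseg2type, ierr)
c        call MPI_TYPE_COMMIT (iseg2type, ierr)
c
c     after which a single send of one element of iseg2type, rooted
c     at global(isgbeg,1) = global(10,1), moves both segments.
c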
c     If the task is a receive, then once the vector is received,
c     the received segments must be added to the correct locations
c     in the current array. These locations are described in ilwork
c     as the beginning position, then the length of the segment.
c
c---------------------------------------------------------------------

      numtask = ilwork(1)

      itkbeg = 1
      m = 0
      idl=0

      DO itask = 1, numtask
        m = m + 1
        itag   = ilwork (itkbeg + 1)
        iacc   = ilwork (itkbeg + 2)
        iother = ilwork (itkbeg + 3)
        numseg = ilwork (itkbeg + 4)
        isgbeg = ilwork (itkbeg + 5)
c
c.... if iacc == 0, then this task is a send (slave side).
c
        if (iacc .EQ. 0) then
c
c.... residual communication
c
          if (code .eq. 'in ') then
            call MPI_ISEND(global(isgbeg, 1), 1, sevsegtype(itask,kdof),
     &                     iother, itag, MPI_COMM_WORLD, req(m), ierr)
          endif
c
c.... solution communication
c
          if (code .eq. 'out') then
            call MPI_IRECV(global(isgbeg, 1), 1, sevsegtype(itask,kdof),
     &                     iother, itag, MPI_COMM_WORLD, req(m), ierr)
c            call MPI_RECV(global(isgbeg,1), 1, sevsegtype(itask,kdof),
c     &                    iother, itag, MPI_COMM_WORLD, status, ierr)
          endif
c
c.... if iacc == 1, then this task is a receive (master side).
c
        else
          if (code .eq. 'in ') then
c
c.... determine the total number of nodes involved in this
c     communication (lfront), including all segments
c
            lfront = 0
            do is = 1,numseg
              lenseg = ilwork (itkbeg + 4 + 2*is)
              lfront = lfront + lenseg
            enddo
c
c.... receive all segments for this task in a single step, into the
c     contiguous staging buffer rtemp. (Note: rtemp is declared
c     integer*8 while the receive is typed MPI_INTEGER; this relies
c     on the type conventions set up in ctypes.f.)
c
            idl=idl+1  ! "i Do Later": counts receives whose assembly
                       ! is deferred until after the waitall
            call MPI_IRECV(rtemp(1,idl), lfront*n, MPI_INTEGER,
     &                     iother, itag, MPI_COMM_WORLD, req(m), ierr)
          endif
          if (code .eq. 'out') then
            call MPI_ISEND(global(isgbeg, 1), 1, sevsegtype(itask,kdof),
     &                     iother, itag, MPI_COMM_WORLD, req(m), ierr)
          endif
        endif

        itkbeg = itkbeg + 4 + 2*numseg

      enddo !! end tasks loop

      call MPI_WAITALL(m, req, stat, ierr)

c
c  What follows is the delayed assembly of the data communicated
c  above; with non-blocking receives it cannot be assembled until
c  after the waitall. Only necessary for commu "in".
c
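c  Continuing the hypothetical two-segment task sketched above
c  (n = 1, segment starts 10 and 20, lengths 3 and 2), the five
c  received values in rtemp(1:5,jdl) are accumulated as
c
c     global(10:12,1) = global(10:12,1) + rtemp(1:3,jdl)
c     global(20:21,1) = global(20:21,1) + rtemp(4:5,jdl)
c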
      if(code .eq. 'in ') then
         itkbeg=1
         jdl=0
         do j=1,numtask ! time to do all the segments that needed to be
                        ! assembled into the global vector

            iacc   = ilwork (itkbeg + 2)
            numseg = ilwork (itkbeg + 4)
            isgbeg = ilwork (itkbeg + 5)
            if(iacc.eq.1) then
               jdl=jdl+1 ! keep track of the order of the rtemp buffers
c
c... add the received data to the global array on the current
c    processor. Note that this involves splitting up the chunk of
c    received data into its correct segment locations for the
c    current processor.
c
               itemp = 1
               do idof = 1,n
                  do is = 1,numseg
                     isgbeg = ilwork (itkbeg + 3 + 2*is)
                     lenseg = ilwork (itkbeg + 4 + 2*is)
                     isgend = isgbeg + lenseg - 1
                     global(isgbeg:isgend,idof) =
     &                      global(isgbeg:isgend,idof)
     &                    + rtemp (itemp:itemp+lenseg-1,jdl)
                     itemp = itemp + lenseg
                  enddo
               enddo
            endif ! end of receive (iacc=1)
            itkbeg = itkbeg + 4 + 2*numseg
         enddo
      endif ! commu "in"

      return
      end
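c---------------------------------------------------------------------
c Typical usage (hypothetical caller; the vector name ivec is
c illustrative only, assuming an integer*8 array ivec(nshg,1) and
c the ilwork array set up alongside ctypes.f):
c
c        call commuInt (ivec, ilwork, 1, 'in ')  ! accumulate slave
c                                                ! copies onto masters
c        call commuInt (ivec, ilwork, 1, 'out')  ! copy master values
c                                                ! back to slaves
c---------------------------------------------------------------------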