fortran - MPI Waitall error "The supplied request in array element 0 was invalid (kind=0)"
I'm trying to implement parallelization in a flow solver code for my PhD. I've inherited a subroutine that sends data between predefined subdomains: it posts the sends with MPI_Isend, posts the receives with MPI_Irecv, and then calls MPI_Waitall.
(The offending code is below.)
! -----------------------------------------------------------
! Definition of instant send/receive passings with barrier at the end
! -----------------------------------------------------------
spos=1                            ! Position of the first element to send within send array
do i=1,isize                      ! loop over the number of exchanging segments
  if (nsendseg(i).ne.0) then      ! choose only domains with something to send
    call MPI_ISend(send(spos),nsendseg(i),MPI_REAL8,i-1,1,MPI_COMM_WORLD,reqs(i),ierr)
    spos=spos+nsendseg(i)
  end if
enddo
rpos=1
do i=1,isize
  if (nrecvseg(i).ne.0) then
    call MPI_IRecv(recv(rpos),nrecvseg(i),MPI_REAL8,i-1,MPI_ANY_TAG,MPI_COMM_WORLD,reqs(i+sum(nsendseg)),ierr)
    rpos=rpos+nrecvseg(i)
  end if
end do
if (irank .eq. 0) print *, reqs
call MPI_Waitall(sum(nsendseg)+sum(nrecvseg),reqs,MPI_STATUSES_IGNORE,ierr)
EDIT, clarifying sum(nsendseg)+sum(nrecvseg): I "believe" (I inherited this code from a former PhD student, who himself inherited it from another, so there is some Chinese whispers going on) that nsendseg represents how many nodes this segment (core) is going to send, and to whom. E.g. running on 10 cores, nsendseg and nrecvseg are arrays of 10 integers representing the nodes shared between subdomains across cores, such that if segment 3 shares 12 nodes with segment 1, 3 nodes with segment 7, and none with any others, its nsendseg is (12,0,0,0,0,0,3,0,0,0). The number of nodes any segment receives and sends is different, because many segments can connect to one. The idea here is that each core iterates through a list of all other cores and sends and receives only the relevant data from each one.
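To make that concrete, here is a minimal, hypothetical sketch of the layout just described (the program name and the print are purely illustrative; the values are the ones from the example above):

program nsendseg_example
   implicit none
   integer, dimension(10) :: nsendseg
   ! segment 3's send counts, indexed by segment: 12 shared nodes with
   ! segment 1, 3 with segment 7, nothing with the others
   nsendseg = (/ 12, 0, 0, 0, 0, 0, 3, 0, 0, 0 /)
   print *, 'total values segment 3 sends:', sum(nsendseg)   ! prints 15
end program nsendseg_example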
The send/receive code above aborts with copies of the error below on some or all ranks.
Abort(336210451) on node 13 (rank 13 in comm 0): Fatal error in PMPI_Waitall: Request pending due to failure, error stack:
PMPI_Waitall(352): MPI_Waitall(count=28734, req_array=0x18ac060, status_array=0x1) failed
PMPI_Waitall(328): The supplied request in array element 2 was invalid (kind=0)
My current idea of what's wrong here is that the reqs array isn't being filled with communication handles correctly. The block below is an example of the reqs array; it "feels" as though MPI_Isend or MPI_Irecv is writing an odd datatype into it (reqs is an array of default integers).
0 -1409286132 0 0 -1409286133 -1409286135
-1409286134 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 -1409286131
0 0 -1409286130 -1409286129 -1409286128 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0
I know this is a bit of a shot in the dark, because I'm basically asking random internet people to divine meaning from a piece of code written by someone who has long since moved on.
If anyone can see the source of my error, or can tell me what an MPI communication handle should look like, or has any other sage advice, it would be greatly appreciated. <3
1 Answer
You need to initialize reqs = MPI_REQUEST_NULL before the loops.
Waiting on null requests is valid and completes immediately, so it is fine to have null requests in the array passed to MPI_Waitall. The calculation of the number of requests looks strange, though. You also didn't show the size of reqs; it should be of size 2*isize for the following solution:
integer,dimension(2*isize) :: reqs

! -----------------------------------------------------------
! Definition of instant send/receive passings with barrier at the end
! -----------------------------------------------------------
spos=1                            ! Position of the first element to send within send array
reqs = MPI_REQUEST_NULL
do i=1,isize                      ! loop over the number of exchanging segments
  if (nsendseg(i).ne.0) then      ! choose only domains with something to send
    call MPI_ISend(send(spos),nsendseg(i),MPI_REAL8,i-1,1,MPI_COMM_WORLD,reqs(i),ierr)
    spos=spos+nsendseg(i)
  end if
enddo
rpos=1
do i=1,isize
  if (nrecvseg(i).ne.0) then
    call MPI_IRecv(recv(rpos),nrecvseg(i),MPI_REAL8,i-1,MPI_ANY_TAG,MPI_COMM_WORLD,reqs(i+isize),ierr)
    rpos=rpos+nrecvseg(i)
  end if
end do
call MPI_Waitall(2*isize,reqs,MPI_STATUSES_IGNORE,ierr)
The solution without holes in the array of requests (the upper bound for the number of requests is still 2*isize; note that after the loops ireq is one past the number of requests actually posted, so MPI_Waitall is called with ireq-1):

integer,dimension(2*isize) :: reqs
integer :: ireq

! -----------------------------------------------------------
! Definition of instant send/receive passings with barrier at the end
! -----------------------------------------------------------
spos=1                            ! Position of the first element to send within send array
ireq=1
reqs = MPI_REQUEST_NULL
do i=1,isize                      ! loop over the number of exchanging segments
  if (nsendseg(i).ne.0) then      ! choose only domains with something to send
    call MPI_ISend(send(spos),nsendseg(i),MPI_REAL8,i-1,1,MPI_COMM_WORLD,reqs(ireq),ierr)
    spos=spos+nsendseg(i)
    ireq=ireq+1
  end if
enddo
rpos=1
do i=1,isize
  if (nrecvseg(i).ne.0) then
    call MPI_IRecv(recv(rpos),nrecvseg(i),MPI_REAL8,i-1,MPI_ANY_TAG,MPI_COMM_WORLD,reqs(ireq),ierr)
    rpos=rpos+nrecvseg(i)
    ireq=ireq+1
  end if
end do
call MPI_Waitall(ireq-1,reqs,MPI_STATUSES_IGNORE,ierr)   ! ireq-1 requests were posted
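For reference, here is a self-contained toy program in the same spirit, using the dense-request pattern from the second snippet. Everything in it (the ring-exchange pattern, the message size of 5, the names) is illustrative only, not part of the original solver:

program dense_requests_demo
   use mpi
   implicit none
   integer :: ierr, irank, isize, i, ireq, spos, rpos
   integer, allocatable :: reqs(:), nsendseg(:), nrecvseg(:)
   real(8), allocatable :: send(:), recv(:)

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, irank, ierr)
   call MPI_Comm_size(MPI_COMM_WORLD, isize, ierr)

   allocate(nsendseg(isize), nrecvseg(isize), reqs(2*isize))
   nsendseg = 0
   nrecvseg = 0

   ! toy pattern: every rank sends 5 values to its right neighbour
   ! and therefore receives 5 values from its left neighbour
   if (isize > 1) then
      nsendseg(mod(irank+1, isize) + 1) = 5
      nrecvseg(mod(irank-1+isize, isize) + 1) = 5
   end if

   allocate(send(max(1, sum(nsendseg))), recv(max(1, sum(nrecvseg))))
   send = real(irank, 8)

   reqs = MPI_REQUEST_NULL   ! unused slots stay as null requests
   ireq = 1
   spos = 1
   do i = 1, isize
      if (nsendseg(i) /= 0) then
         call MPI_Isend(send(spos), nsendseg(i), MPI_REAL8, i-1, 1, &
                        MPI_COMM_WORLD, reqs(ireq), ierr)
         spos = spos + nsendseg(i)
         ireq = ireq + 1
      end if
   end do
   rpos = 1
   do i = 1, isize
      if (nrecvseg(i) /= 0) then
         call MPI_Irecv(recv(rpos), nrecvseg(i), MPI_REAL8, i-1, MPI_ANY_TAG, &
                        MPI_COMM_WORLD, reqs(ireq), ierr)
         rpos = rpos + nrecvseg(i)
         ireq = ireq + 1
      end if
   end do

   call MPI_Waitall(ireq-1, reqs, MPI_STATUSES_IGNORE, ierr)   ! ireq-1 requests posted

   if (isize > 1) print *, 'rank', irank, 'received values from rank', int(recv(1))
   call MPI_Finalize(ierr)
end program dense_requests_demo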
Comments:
What do nsendseg and nrecvseg hold? – Ian Bush
call MPI_Waitall(sum(nsendseg)+sum(nrecvseg)...) looks very strange; the first argument should be the number of requests upon which you are waiting, not the amount of data being sent about. – Ian Bush
sum(nsendseg)+sum(nrecvseg) is actually an upper bound for the number of requests used, so in some way it makes sense. It doesn't work in combination with using i / sum(nsendseg)+i as an index into the array, and would only be a valid upper bound with the dense use of reqs as in my second code snippet. – Joachim