
I'm trying to implement parallelization in a flow-solver code for my PhD. I've inherited a subroutine that sends data between predefined subdomains: it posts the sends with MPI_Isend, posts the receives with MPI_Irecv, and then calls MPI_Waitall.

(the offending code below:)

        ! -----------------------------------------------------------
        ! Definition of instant send/receive passings with barrier at the end
        ! -----------------------------------------------------------

        spos=1                          ! Position of the first element to send within send array
        do i=1,isize                    ! loop over the number of exchanging segments
            if (nsendseg(i).ne.0) then  ! choose only domains with something to send
                call MPI_ISend(send(spos),nsendseg(i),MPI_REAL8,i-1,1,MPI_COMM_WORLD,reqs(i),ierr)
                spos=spos+nsendseg(i)
            end if
        enddo
    
        rpos=1
        do i=1,isize
            if (nrecvseg(i).ne.0) then
                call MPI_IRecv(recv(rpos),nrecvseg(i),MPI_REAL8,i-1,MPI_ANY_TAG,MPI_COMM_WORLD,reqs(i+sum(nsendseg)),ierr)
                rpos=rpos+nrecvseg(i)
            end if
        end do
        
        if (irank .eq. 0) print *, reqs
        
        call MPI_Waitall(sum(nsendseg)+sum(nrecvseg),reqs,MPI_STATUSES_IGNORE,ierr)

EDIT CLARIFYING the sum(nsendseg)+sum(nrecvseg): I "believe" (I inherited this code from a former PhD student who himself inherited it from another, so there's some Chinese whispers going on) that nsendseg describes how many nodes the segment (core) is going to send, and to where. E.g. running on 10 cores, nsendseg and nrecvseg are arrays of 10 integers giving the shared nodes between subdomains across cores, such that if segment 3 shares 12 nodes with segment 1 and 3 nodes with segment 7 and none with any others, its nsendseg is (12,0,0,0,0,0,3,0,0,0). The number of nodes any segment receives and sends is different because many segments can connect to one. The idea is that each core iterates through a list of all other cores and sends to and receives from each one only the relevant data.
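To make that layout concrete, here is a tiny illustrative sketch (a standalone demo, not taken from the solver) of the 10-core example above, together with the send-buffer offsets that the spos logic in the snippet would produce:

        ! Illustration only: nsendseg on segment 3 of the 10-core example,
        ! which shares 12 nodes with segment 1 and 3 nodes with segment 7.
        program nsendseg_demo
            implicit none
            integer, parameter :: isize = 10
            integer :: nsendseg(isize), spos, i

            nsendseg = (/ 12, 0, 0, 0, 0, 0, 3, 0, 0, 0 /)

            ! The send buffer is packed contiguously, so the block destined
            ! for rank i-1 starts one past the sum of all earlier counts.
            spos = 1
            do i = 1, isize
                if (nsendseg(i) /= 0) then
                    print *, 'to rank', i-1, ':', nsendseg(i), 'values from send(', spos, ')'
                end if
                spos = spos + nsendseg(i)
            end do
        end program nsendseg_demo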

This snippet of code aborts with copies of the error below across some or all nodes.

Abort(336210451) on node 13 (rank 13 in comm 0): Fatal error in PMPI_Waitall: Request pending due to failure, error stack:
PMPI_Waitall(352): MPI_Waitall(count=28734, req_array=0x18ac060, status_array=0x1) failed
PMPI_Waitall(328): The supplied request in array element 2 was invalid (kind=0)

My current idea about what's wrong is that the reqs array isn't having communication handles written into it correctly. The block of text below is an example dump of the reqs array; it "feels" as if the Isend or Irecv subroutines are trying to put an odd datatype into it (reqs is an array of default integers).

           0 -1409286132           0           0 -1409286133 -1409286135
 -1409286134           0           0           0           0           0
           0           0           0           0           0           0
           0           0           0           0           0           0
           0           0           0           0           0           0
           0           0           0           0           0           0
           0           0           0           0           0           0
           0           0           0           0           0           0
           0           0           0           0           0           0
           0           0           0           0           0           0
           0           0           0           0           0 -1409286131
           0           0 -1409286130 -1409286129 -1409286128           0
           0           0           0           0           0           0
           0           0           0           0           0           0
           0           0           0           0           0           0
           0           0           0           0           0           0
           0           0           0           0           0           0
           0           0           0           0           0           0
           0           0           0           0           0           0
           0           0           0           0           0           0
           0           0           0           0           0           0
           0           0
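For what it's worth, with the legacy mpi module (or mpif.h) a request handle is just an opaque default INTEGER whose value is filled in by MPI, so large negative numbers like the ones above are nothing unusual (that is roughly what MPICH handles look like when printed). The suspicious entries are the zeros, since a plain 0 is not guaranteed to be either a real handle or MPI_REQUEST_NULL. A minimal, hypothetical demo program (not part of the solver) illustrating this:

        program request_handle_demo
            use mpi                     ! legacy bindings: a request handle is an opaque default integer
            implicit none
            integer :: ierr, req

            call MPI_Init(ierr)

            req = MPI_REQUEST_NULL      ! the only safe "empty" value; a plain 0 need not mean anything to MPI
            call MPI_Ibarrier(MPI_COMM_WORLD, req, ierr)    ! any nonblocking call fills in a real handle
            print *, 'opaque handle value:', req            ! with MPICH this often prints as a large negative integer

            call MPI_Wait(req, MPI_STATUS_IGNORE, ierr)
            call MPI_Finalize(ierr)
        end program request_handle_demo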

I know this is a bit of a shot in the dark because I'm basically asking random internet people to divine meaning from a piece of code written by someone who has long since moved on.

If anyone can see the source of my error, or alternatively tell me what an MPI communication handle should look like, or offer any other sage advice, it would be greatly appreciated. <3


asked Mar 6 at 17:02 by Subject303, edited Mar 7 at 6:28 by Ian Bush
  • On bus so can't answer properly, but note your reqs array will have "holes" and so be unsuitable for Waitall if you skip sending or receiving any messages. You need a separate indexing variable for the reqs. – Ian Bush Commented Mar 6 at 18:30
  • Could you clarify exactly what nsendseg and nrecvseg hold? – Ian Bush Commented Mar 6 at 19:34
  • @IanBush I "believe" (I inherited this code from a former PhD student who himself inherited it from another, so there's some Chinese whispers going on) that nsendseg represents the number of nodes that the segment (core) is going to send, and to where. E.g. running on 10 cores they are arrays of 10 integers representing the shared nodes between subdomains across cores, such that if segment 3 shares 12 nodes with segment 1 and 3 with segment 7 and none with any others, nsendseg is (12,0,0,0,0,0,3,0,0,0). The number of nodes any segment receives and sends is different because many segments can connect to one. – Subject303 Commented Mar 6 at 19:37
  • Could you edit the question to include this clarification, please? I'll try to add my own answer, most likely tomorrow, but with this info call MPI_Waitall(sum(nsendseg)+sum(nrecvseg)... looks very strange; the first argument should be the number of requests upon which you are waiting, not the amount of data being sent about. – Ian Bush Commented Mar 6 at 19:46
  • @GillesGouaillardet sum(nsendseg)+sum(nrecvseg) is actually an upper bound for the number of requests used, so in some way it makes sense. It doesn't work in combination with using i / sum(nsendseg)+i as an index into the array, and would only be a valid upper bound with the dense use of reqs as in my second code snippet. – Joachim Commented Mar 7 at 7:55

1 Answer

You need to initialize reqs = MPI_REQUEST_NULL before the loop.

Waiting on null requests is valid and completes immediately, so it is fine to have MPI_REQUEST_NULL entries in the array passed to MPI_Waitall. The request count sum(nsendseg)+sum(nrecvseg) also looks wrong: it sums data counts rather than numbers of requests (at best it is a loose upper bound). You didn't show the size of reqs; it needs to be 2*isize for the following solution:

        integer,dimension(2*isize) :: reqs

        ! -----------------------------------------------------------
        ! Definition of instant send/receive passings with barrier at the end
        ! -----------------------------------------------------------

        spos=1                          ! Position of the first element to send within send array
        reqs = MPI_REQUEST_NULL         ! unused slots stay null; Waitall treats null requests as already complete
        do i=1,isize                    ! loop over the number of exchanging segments
            if (nsendseg(i).ne.0) then  ! choose only domains with something to send
                call MPI_ISend(send(spos),nsendseg(i),MPI_REAL8,i-1,1,MPI_COMM_WORLD,reqs(i),ierr)
                spos=spos+nsendseg(i)
            end if
        enddo
    
        rpos=1
        do i=1,isize
            if (nrecvseg(i).ne.0) then
                call MPI_IRecv(recv(rpos),nrecvseg(i),MPI_REAL8,i-1,MPI_ANY_TAG,MPI_COMM_WORLD,reqs(i+isize),ierr)
                rpos=rpos+nrecvseg(i)
            end if
        end do
        
        call MPI_Waitall(2*isize,reqs,MPI_STATUSES_IGNORE,ierr)

The solution without holes in the array of requests (the upper bound for number of requests is still 2*isize):

        integer,dimension(2*isize) :: reqs
        integer :: ireq

        ! -----------------------------------------------------------
        ! Definition of instant send/receive passings with barrier at the end
        ! -----------------------------------------------------------

        spos=1                          ! Position of the first element to send within send array
        ireq=1
        reqs = MPI_REQUEST_NULL
        do i=1,isize                    ! loop over the number of exchanging segments
            if (nsendseg(i).ne.0) then  ! choose only domains with something to send
                call MPI_ISend(send(spos),nsendseg(i),MPI_REAL8,i-1,1,MPI_COMM_WORLD,reqs(ireq),ierr)
                spos=spos+nsendseg(i)
                ireq=ireq+1
            end if
        enddo
    
        rpos=1
        do i=1,isize
            if (nrecvseg(i).ne.0) then
                call MPI_IRecv(recv(rpos),nrecvseg(i),MPI_REAL8,i-1,MPI_ANY_TAG,MPI_COMM_WORLD,reqs(ireq),ierr)
                rpos=rpos+nrecvseg(i)
                ireq=ireq+1
            end if
        end do
        
        call MPI_Waitall(ireq-1,reqs,MPI_STATUSES_IGNORE,ierr)   ! ireq-1 requests were actually posted
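As a side note beyond the fix above: if the solver is ever moved to the mpi_f08 bindings, request handles become type(MPI_Request) instead of plain integers, which lets the compiler catch most handle-array mistakes at build time. A hedged sketch of the same compacted exchange under mpi_f08 (assuming the surrounding variables keep the names used above):

        use mpi_f08                           ! modern bindings: handles are derived types, not integers
        type(MPI_Request) :: reqs(2*isize)
        integer :: i, ireq, spos, rpos, ierr

        reqs = MPI_REQUEST_NULL
        ireq = 0

        spos = 1
        do i = 1, isize
            if (nsendseg(i) /= 0) then
                ireq = ireq + 1
                call MPI_Isend(send(spos), nsendseg(i), MPI_REAL8, i-1, 1, &
                               MPI_COMM_WORLD, reqs(ireq), ierr)
                spos = spos + nsendseg(i)
            end if
        end do

        rpos = 1
        do i = 1, isize
            if (nrecvseg(i) /= 0) then
                ireq = ireq + 1
                call MPI_Irecv(recv(rpos), nrecvseg(i), MPI_REAL8, i-1, MPI_ANY_TAG, &
                               MPI_COMM_WORLD, reqs(ireq), ierr)
                rpos = rpos + nrecvseg(i)
            end if
        end do

        call MPI_Waitall(ireq, reqs(1:ireq), MPI_STATUSES_IGNORE, ierr)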
