Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Waitall never returns
From: Ross Boylan (ross_at_[hidden])
Date: 2014-04-09 20:26:49

On Fri, 2014-04-04 at 22:40 -0400, George Bosilca wrote:
> Ross,
> I’m not familiar with the R implementation you are using, but bear with me and I will explain how you can ask Open MPI about the list of all pending requests on a process. Disclosure: This is Open MPI deep voodoo, an extreme way to debug applications that might save you quite some time.
> The only thing you need is the communicator you posted your requests into, or at least a pointer to it. Then you attach to your process (or processes) with your preferred debugger and call
> mca_pml_ob1_dump(struct ompi_communicator_t* comm, int verbose)
> With gdb this should look like “call mca_pml_ob1_dump(my_comm, 1)”. This will dump human readable information about all the requests pending on a communicator (both sends and receives).
Thank you so much for the tip. After inserting a barrier failed to help,
I decided to try this. After much messing around (details below), I got:
BTL SM 0x7f615dea9660 endpoint 0x3c15d90 [smp_rank 5] [peer_rank 0]
BTL SM 0x7f615dea9660 endpoint 0x3b729e0 [smp_rank 5] [peer_rank 1]
BTL SM 0x7f615dea9660 endpoint 0x3b72ad0 [smp_rank 5] [peer_rank 2]
BTL SM 0x7f615dea9660 endpoint 0x3c06e60 [smp_rank 5] [peer_rank 3]
BTL SM 0x7f615dea9660 endpoint 0x3c06f50 [smp_rank 5] [peer_rank 4]
[n2:10664] [Rank 0]
[n2:10664] [Rank 1]
[n2:10664] [Rank 2]
[n2:10664] [Rank 3]
[n2:10664] [Rank 4]
[n2:10664] [Rank 5]
[n2:10664] [Rank 6]
[n2:10664] [Rank 7]
[n2:10664] [Rank 8]
[n2:10664] [Rank 9]
[n2:10664] [Rank 10]
[n2:10664] [Rank 11]
[n2:10664] [Rank 12]
[n2:10664] [Rank 13]

Not entirely human readable if the human is me!
Does smp_rank (and peer_rank) equal what I would get from MPI_Comm_rank?
I hope so, because I was aiming for rank 5.
How do I know if I'm sending or receiving? They should all be sends.

What are all the lines like
[n2:10664] [Rank 7]?

What this seems to show is very odd.
First, my code thinks there are 3 outstanding Isends. Does this report
include requests that have become inactive (because they completed)?

Second, during normal operations rank 5 does not talk to ranks 1-4.
I did put an MPI_Barrier in just before shutdown, but the trace
information indicates rank 5 never gets to that step.

To provide fuller context, and maybe some clues to others who attempt
this, I first tried this with my non-debug-enabled libraries. I guessed
that the ranks were in the same order as the process IDs and invoked
gdb on my R executable, giving it the process ID (once the system
reached its stuck state).
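For anyone retracing this, the attach step looked roughly like the following (the PID and executable path are illustrative):

```
$ gdb -p 10664      # or: gdb /path/to/R-executable 10664
(gdb) bt            # sanity check that the process is stuck in MPI_Waitall
```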

Accessing the communicator was tricky; it goes via the comm variable
defined in the Rmpi library. So overall, the R executable starts and
loads the Rmpi library, which in turn loads and references the MPI
library. The communicators are declared in Rmpi as MPI_Comm *comm,
and the one I need is comm[1].

When I tried to reference it I got an error that there was no debugging
info. I reconfigured Open MPI with --enable-debug and rebuilt it (make
clean all install). Then I launched everything again; I did not rebuild
Rmpi against the debug libraries, though I installed the debug libraries
in the location where the regular ones had been.

I still had problems:
(gdb) p comm[1]
cannot subscript something of type `<data variable, no debug info>'
The error message I got before building MPI with debug was a bit different
and stronger.

I realized that comm was a symbol in Rmpi, which I had not built with
debug symbols. Since MPI_Comm should now be understood by the debugger,
I tried an explicit cast, which worked:
call mca_pml_ob1_dump(((MPI_Comm *) comm)[1], 1)
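An equivalent form, if it helps readability, is to bind the cast to a gdb convenience variable first (same comm symbol assumed):

```
(gdb) set $c = (MPI_Comm *) comm        # give gdb the type once
(gdb) call mca_pml_ob1_dump($c[1], 1)   # then index normally
```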

So I'm not entirely sure if the build of a debug version of MPI was
necessary.

> If you are right, all processes will report NONE, and the bug is somewhere in-between your application and the MPI library. Otherwise, you might have some not-yet-completed requests pending…
> George.
> On Apr 4, 2014, at 22:20 , Ross Boylan <ross_at_[hidden]> wrote:
> > On 4/4/2014 6:01 PM, Ralph Castain wrote:
> >> It sounds like you don't have a balance between sends and recvs somewhere - i.e., some apps send messages, but the intended recipient isn't issuing a recv and waiting until the message has been received before exiting. If the recipient leaves before the isend completes, then the isend will never complete and the waitall will not return.
> > I'm pretty sure the sends complete because I wait on something that can only be computed after the sends complete, and I know I have that result.
> >
> > My current theory is that my modifications to Rmpi are not properly tracking all completed messages, resulting in it thinking there are outstanding messages (and passing a positive count to the C-level MPI_Waitall with associated garbagey arrays). But I haven't isolated the problem.
> >
> > Ross
> >>
> >>
> >> On Apr 4, 2014, at 5:20 PM, Ross Boylan <ross_at_[hidden]> wrote:
> >>
> >>> During shutdown of my application the processes issue a waitall, since they have done some Isends. A couple of them never return from that call.
> >>>
> >>> Could this be the result of some of the processes already being shut down (the processes with the problem were late in the shutdown sequence)? If so, what is the recommended solution? A barrier?
> >>>
> >>> The shutdown proceeds in stages, but the processes in question are not told to shut down until all the messages they have sent have been received. So there shouldn't be any outstanding messages from them.
> >>>
> >>> My reading of the manual is that Waitall with a count of 0 should return immediately, not hang. Is that correct?
> >>>
> >>> Running under R with openmpi 1.7.4.
> >>> _______________________________________________
> >>> users mailing list
> >>> users_at_[hidden]
> >>>