Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] OpenMPI program getting stuck at poll()
From: Lenny Verkhovsky (lenny.verkhovsky_at_[hidden])
Date: 2009-03-10 11:49:58


Hi,
Can you try Open MPI version 1.3?

On 3/9/09, Prasanna Ranganathan <prasanna_at_[hidden]> wrote:
>
> Hi all,
>
> I have a distributed program that uses Open MPI and runs on 400+ nodes. I
> have previously run the same binary successfully with nearly the same setup.
> However, in my last two runs the program got stuck after a while, before
> completing. The stack trace at the point where it hangs is as follows:
>
> #0 0x00002ad0000c00df in poll () from /lib/libc.so.6
> #1 0x00002acfffa49c27 in opal_poll_dispatch () from /usr/lib64/libopen-pal.so.0
> #2 0x00002acfffa47add in opal_event_base_loop () from /usr/lib64/libopen-pal.so.0
> #3 0x00002acfffa43203 in opal_progress () from /usr/lib64/libopen-pal.so.0
> #4 0x00002acfff78b315 in ompi_request_test_some () from /usr/lib64/libmpi.so.0
> #5 0x00002acfff7adf7a in PMPI_Testsome () from /usr/lib64/libmpi.so.0
> ....
>
> I checked all the nodes and they seem to be up and doing fine. Any
> suggestions/hints on what might be happening here would help greatly. Thanks
> in advance.
>
> I am using Open MPI 1.2.7 on Gentoo Linux.
>
> Regards,
>
> Prasanna.