Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Dropped message for the non-existing communicator
From: Terry Dontje (Terry.Dontje_at_[hidden])
Date: 2008-11-08 14:12:56


Ok, I'll try this with the latest trunk.

thanks,

--td
George Bosilca wrote:
> Apparently it was with 19845, so before the patch that is supposed to
> fix this issue. Terry, can you please test with a more recent version
> (> 19929)?
>
> Thanks,
> george.
>
> On Nov 8, 2008, at 9:54 AM, Edgar Gabriel wrote:
>
>> Terry,
>>
>> was this with the trunk or v1.3? If it was the trunk, was it before
>> r19929 was applied? The reason I ask is that r19929 should remove
>> all error messages related to 'non-existing communicators'. Hierarch,
>> btw, is not the cause of the error messages even before that; it
>> just exposes them more frequently...
>>
>> Thanks
>> Edgar
>>
>> Terry Dontje wrote:
>>> I am seeing the message "Dropped message for the non-existing
>>> communicator" when running hpcc with np=124 against r19845. This
>>> seems to be pretty reproducible at np=124. When the job prints out
>>> the message above, some set of processes are in an MPI_Bcast and the
>>> 15 processes reporting the message are stuck in MPI_Barrier.
>>> I am not sure how related this is to #1408 since I am not invoking
>>> the hierarchical collectives. I just wanted to see if anyone else
>>> has tried to run hpcc at such an np size with any success.
>>> My next steps are to try to run this with the latest trunk and to
>>> narrow down the failing case.
>>> --td
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>> --
>> Edgar Gabriel
>> Assistant Professor
>> Parallel Software Technologies Lab http://pstl.cs.uh.edu
>> Department of Computer Science University of Houston
>> Philip G. Hoffman Hall, Room 524 Houston, TX-77204, USA
>> Tel: +1 (713) 743-3857 Fax: +1 (713) 743-3335