Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Dropped message for the non-existing communicator
From: Terry Dontje (Terry.Dontje_at_[hidden])
Date: 2008-11-08 14:11:30


This was on v1.3 r19845.

--td

Edgar Gabriel wrote:
> Terry,
>
> Was this with the trunk or v1.3? If it was the trunk, was it before
> r19929 was applied? I ask because r19929 should remove all error
> messages related to 'non-existing communicators'. Incidentally,
> hierarch was not the cause of the error messages even before that;
> it just exposes them more frequently...
>
> Thanks
> Edgar
>
> Terry Dontje wrote:
>> I am seeing the message "Dropped message for the non-existing
>> communicator" when running hpcc with np=124 against r19845. This
>> seems to be pretty reproducible at np=124. When the job prints the
>> message above, some of the processes are in MPI_Bcast and the
>> 15 processes reporting the message are stuck in MPI_Barrier.
>> I am not sure how related this is to #1408 since I am not invoking
>> the hierarchical collectives. I just wanted to see if anyone else
>> has tried to run hpcc at such an np size with any success.
>>
>> My next steps are to try to run this with the latest trunk and to
>> narrow down the failing case.
>>
>> --td