Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] OMPI 1.3 - PERUSE peruse_comm_spec_t peer Negative Value
From: George Bosilca (bosilca_at_[hidden])
Date: 2009-03-25 13:54:26


I did a quick check and I see no other code path that could leave the
peer equal to -1. Well, at least in theory. Please let me know if
you find any other problems with our PERUSE interface.
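
For a profiling callback that indexes an array by spec->peer, a range
check is still cheap insurance. Below is a minimal sketch, not the actual
fix from the patch. It assumes a stand-in struct (comm_spec_sketch_t)
carrying only the peer field, and a hypothetical helper name
(count_peer_event); a real tool would include peruse.h and read the peer
from peruse_comm_spec_t inside its callback.

```c
#include <stddef.h>

/* Hypothetical stand-in for the single field of peruse_comm_spec_t that
 * this sketch needs; it is NOT the real PERUSE type. */
typedef struct {
    int peer;
} comm_spec_sketch_t;

/* Count one event for spec->peer.
 * Skips self-communication and any peer outside [0, comm_size), so a
 * stray -1 can never index the counter array.
 * Returns 1 if the event was counted, 0 if it was skipped. */
static int count_peer_event(unsigned long *counts, int comm_size,
                            int self_rank, const comm_spec_sketch_t *spec)
{
    if (counts == NULL || spec == NULL) {
        return 0;
    }
    if (spec->peer < 0 || spec->peer >= comm_size) {
        return 0;   /* out-of-range peer, e.g. the -1 reported here */
    }
    if (spec->peer == self_rank) {
        return 0;   /* ignore messages a process sends to itself */
    }
    counts[spec->peer]++;
    return 1;
}
```

Inside a PERUSE callback, the helper would be called with the
communicator size and the local rank obtained once at startup; the
callback itself can keep returning MPI_SUCCESS whether or not the event
was counted.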

Meanwhile, I will take a look at the peruse.h header.

   Thanks,
     george.

On Mar 25, 2009, at 13:29 , Kiril Dichev wrote:

> Hi,
>
> at least for the specific test program I used, the negative values for
> the peer attribute disappeared after George's modifications in 20844.
>
> One remark: after installation, I had to remove the '#include
> "ompi_config.h"' line from the "include/peruse.h" header to get PERUSE
> applications to compile. Otherwise I got a missing header error
> message for ompi_config.h.
>
> Regards,
> Kiril
>
>
> On Mon, 2009-03-23 at 16:34 -0400, George Bosilca wrote:
>> You are absolutely right, the peer should never be set to -1 in any
>> of the PERUSE callbacks. I checked the code this morning and figured
>> out what the problem was. We reported the peer and the tag attached to
>> a request before setting the right values (some code had moved
>> around). I submitted a patch and created a "move request" to get this
>> correction into one of our stable releases as soon as possible. The
>> move request can be followed in our Trac system at the following link:
>> https://svn.open-mpi.org/trac/ompi/ticket/1845
>> If you want to play with this change, please update your Open MPI
>> installation to a nightly build or a fresh SVN checkout at revision
>> 20844 or later (a nightly including this change will be posted on our
>> website tomorrow morning).
>>
>> Thanks,
>> george.
>>
>> On Mar 23, 2009, at 13:23 , Samuel K. Gutierrez wrote:
>>
>>> Hi Kiril,
>>>
>>> Appreciate the quick response.
>>>
>>>> Hi Samuel,
>>>>
>>>> On Sat, 21 Mar 2009 18:18:54 -0600 (MDT)
>>>> "Samuel K. Gutierrez" <samuel_at_[hidden]> wrote:
>>>>> Hi All,
>>>>>
>>>>> I'm writing a simple profiling library which utilizes PERUSE. My callback
>>>>
>>>> So am I :)
>>>>
>>>>> function counts communication events (see example code below). I
>>>>> noticed that in OMPI v1.3 spec->peer is sometimes a negative value
>>>>> (OMPI v1.2.6 did not exhibit this behavior). I added some boundary
>>>>> checks, but it seems as if this is a bug? I hope I'm not missing
>>>>> something...
>>>>
>>>> It took me quite some time to reproduce the error - I also
>>>
>>> Sorry about that - I should have provided more information.
>>>
>>>> got peer value "-1" for the Peruse peruse_comm_spec_t
>>>> struct. I only managed to reproduce this with
>>>> communication of a process with itself, which is an
>>>> unusual scenario. Anyway, for all the tests I did, the
>>>> error happened only when:
>>>>
>>>> - a process communicates with itself
>>>> - the MPI receive call is made
>>>> - the Peruse event "PERUSE_COMM_MSG_REMOVE_FROM_UNEX_Q" is triggered
>>>
>>> That's interesting... Nice work!
>>>
>>>>
>>>>
>>>> The file ompi/mca/pml/ob1/pml_ob1_recvreq.c seems to be
>>>> the place where the above event is triggered with the wrong
>>>> value of the peer attribute.
>>>>
>>>> I will let you know if I find something.
>>>
>>> I will also take a look.
>>>
>>>>
>>>>
>>>> Best regards,
>>>> Kiril
>>>>
>>>>>
>>>>> The peruse test provided in the OMPI v1.3 source
>>>>> exhibits similar behavior:
>>>>> mpirun -np 2 ./mpi_peruse | grep peer:-1
>>>>>
>>>>> int callback(peruse_event_h event_h, MPI_Aint unique_id,
>>>>>              peruse_comm_spec_t *spec, void *param) {
>>>>>     if (spec->peer == rank) {
>>>>>         return MPI_SUCCESS;
>>>>>     }
>>>>>     rrCounts[spec->peer]++;
>>>>>     return MPI_SUCCESS;
>>>>> }
>>>>>
>>>>>
>>>>> Any insight is greatly appreciated.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Samuel K. Gutierrez
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> devel_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>
>>>>
>>>
>>> Appreciate the help,
>>>
>>> Samuel K. Gutierrez
>
>