Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Error message when using MPI_Type_struct()
From: Thomas Ropars (tropars_at_[hidden])
Date: 2009-01-12 09:29:53


Hi Aurelien,

Thank you for your answer.

Aurélien Bouteiller wrote:
> Hi Thomas,
>
> The message you get comes from the convertor. The convertor is in
> charge of packing/unpacking the data. Since you add an extra int to
> the wire data yourself, the convertor gets confused on the receiver
> side, as it receives a message that is not in the expected format.
>
That is not the problem, because on the receiver side I also create,
using the same mechanism, a new datatype that includes the piggybacked
integer. My code works well, but when the piggybacked integer has been
dynamically allocated, I get those messages.
> In my opinion, what you should do is create a new convertor (there
> is an MCA framework for this) that allocates this extra int for you.
> Then, because you will use the same convertor at both ends, you will
> be able to unpack correctly what you sent. As a free benefit, you will
> be able to use the mpool instead of malloc, which should moderate the
> overhead of creating the intermediate buffer.
I will have a look at that.

Thomas
>
> Aurelien
>
>
> On 8 Jan 2009, at 05:13, Thomas Ropars wrote:
>
>> Hi,
>>
>> I am resubmitting this old question because I didn't get any answer
>> last time.
>>
>> My problem is the following:
>> I am trying to implement a piggyback mechanism: I want to piggyback
>> an integer on every message.
>> To do that, I dynamically create a new datatype for each send.
>> The code I use is shown below. This code works fine.
>> But if the integer I piggyback (named "piggy" in the code) is
>> allocated using malloc, I still get the correct result, but I also
>> get the following kind of message:
>>
>> ../../ompi/datatype/datatype_pack.h:38
>> Pointer 0xbff25fbc size 4 is outside [0xbff25fbc,0x911300c] for
>> base ptr (nil) count 1 and data
>> Datatype 0x9183be8[] size 8 align 4 id 0 length 3 used 2
>> true_lb -1074634820 true_ub 152121356 (true_extent 1226756176) lb
>> -1074634820 ub 152121356 (extent 1226756176)
>> nbElems 2 loops 0 flags 102 (commited )-c-----GD--[---][---]
>> contain MPI_INT
>> --C---P-D--[ C ][INT] MPI_INT count 1 disp 0xbff25fbc
>> (-1074634820) extent 4 (size 4)
>> --C---P-D--[ C ][INT] MPI_INT count 1 disp 0x9113008
>> (152121352) extent 4 (size 4)
>> -------G---[---][---] MPI_END_LOOP prev 2 elements first elem
>> displacement -1074634820 size of data 8
>> Optimized description
>> -cC---P-DB-[ C ][ERR] MPI_INT count 1 disp 0xbff25fbc
>> (-1074634820) extent 4 (size 4)
>> -cC---P-DB-[ C ][ERR] MPI_INT count 1 disp 0x9113008
>> (152121352) extent 4 (size 4)
>> -------G---[---][---] MPI_END_LOOP prev 2 elements first elem
>> displacement -1074634820 size of data 8
>>
>> My questions are: what does this message mean? Is there an error in
>> my code? And what can I do to avoid this message?
>>
>> Regards,
>>
>> Thomas
>>
>> Thomas Ropars wrote:
>>> Hi,
>>>
>>> I'm currently implementing a mechanism to piggyback information on
>>> messages. On message sending, I dynamically create a new datatype
>>> composed of the original buffer and of the data to piggyback.
>>>
>>> For instance, if I want to piggyback an integer on each message, I
>>> use the following code:
>>>
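>>> int send(void *buf,
>>>          size_t count,
>>>          struct ompi_datatype_t* datatype,
>>>          int dst,
>>>          int tag,
>>>          mca_pml_base_send_mode_t sendmode,
>>>          ompi_communicator_t* comm )
>>> {
>>>     MPI_Datatype type[2];
>>>     int blocklen[2];
>>>     MPI_Aint disp[2];
>>>     MPI_Datatype datatype_out;
>>>     int piggy=0;   /* integer piggybacked on the message */
>>>     int rc;
>>>
>>>     /* block 0: the user's buffer; block 1: the piggybacked integer */
>>>     type[0]=datatype;
>>>     type[1]=MPI_INT;
>>>     blocklen[0]=count;
>>>     blocklen[1]=1;
>>>
>>>     /* absolute addresses, so the send below starts at MPI_BOTTOM */
>>>     MPI_Address(buf,disp);
>>>     MPI_Address(&piggy,disp+1);
>>>
>>>     /* the new datatype is an output argument: pass its address */
>>>     MPI_Type_struct(2, blocklen, disp, type, &datatype_out);
>>>
>>>     MPI_Type_commit(&datatype_out);
>>>
>>>     /* then I call the original send function and send my new
>>>        datatype (assumed here to return an int status) */
>>>     rc = original_send(MPI_BOTTOM, 1, datatype_out, dst, tag, sendmode,
>>>                        comm);
>>>
>>>     /* release the per-send datatype so it does not leak */
>>>     MPI_Type_free(&datatype_out);
>>>     return rc;
>>> }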
>>>
>>> This code works well. But if the data I want to piggyback is
>>> dynamically allocated, I get this kind of error message:
>>>
>>> ../../ompi/datatype/datatype_pack.h:40
>>> Pointer 0x823fab0 size 4 is outside [0xbfef8920,0x823fab4] for
>>> base ptr (nil) count 1 and data
>>> Datatype 0x8240b90[] size 8 align 4 id 0 length 3 used 2
>>> true_lb -1074820832 true_ub 136575668 (true_extent 1211396500) lb
>>> -1074820832 ub 136575668 (extent 1211396500)
>>> nbElems 2 loops 0 flags 102 (commited )-c-----GD--[---][---
>>>
>>> Despite this message, the function still works correctly.
>>>
>>> Can someone explain to me what this message means? It seems that in
>>> the first part of the error message, the lower bound and the upper
>>> bound of the datatype are switched, but I don't know why.
>>>
>>>
>>> Regards.
>>>
>>> Thomas Ropars
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>