Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RML Send
From: Ralph H Castain (rhc_at_[hidden])
Date: 2008-06-19 10:43:46


WOW! Somebody really screwed up the DSS by adding some new APIs I'd never
heard of before, and they really can cause the system to break!

I'm going to have to straighten this mess out - it is a total disaster.
There needs to be just ONE way of packing and unpacking, not two totally
incompatible methods.

Will let you know when it is fixed - probably early next week.
Ralph
 

On 6/19/08 8:34 AM, "Leonardo Fialho" <lfialho_at_[hidden]> wrote:

> Hi Ralph,
>
> My mistake, I'm really using ORTE_PROC_MY_DAEMON->jobid.
>
> I had success using pack_buffer()/unpack_buffer() with the OPAL_BYTE type,
> but something strange occurred when I was using pack()/unpack(): the value
> of num_bytes increases. For example, I tried to read num_bytes=5, and after
> an unpack this variable holds 33! I don't understand it...
>
> Thanks,
> Leonardo Fialho
>
> Ralph Castain wrote:
>>
>> On 6/17/08 3:35 PM, "Leonardo Fialho" <lfialho_at_[hidden]> wrote:
>>
>>
>>> Hi Ralph,
>>>
>>> 1) Yes, I'm using ORTE_RML_TAG_DAEMON with a new "command" that I
>>> defined in "odls_types.h".
>>> 2) I'm packing and unpacking variables like OPAL_INT, OPAL_SIZE, ...
>>> 3) I'm not blocking the "process_commands" function with long code.
>>> 4) To find the daemon's vpid and jobid I used the same jobid as the
>>> app (in this solution, it can be changed), and the vpids are assigned
>>> sequentially (0 for mpirun and 1 to N for the orteds).
>>>
>>
>> The jobid of the daemons is different from the jobid of the apps. So at the
>> moment, you are actually sending the message to another app!
>>
>> You can find the jobid of the daemons by extracting it as
>> ORTE_PROC_MY_DAEMON->jobid. Please note, though, that the app has no
>> knowledge of the contact info for that daemon, so this message will have to
>> route through the local daemon. Happens transparently, but just wanted to be
>> clear as to how this is working.
>>
>>
>>> The problem is: I need to send buffered data, and I don't know the
>>> type of this data. I'm trying to use OPAL_NULL and OPAL_DATA_VALUE to
>>> send it, but I've had no success.... :(
>>>
>>
>> If I recall correctly, you were trying to archive messages that flowed
>> through the PML - correct? I would suggest just treating them as bytes and
>> packing them as an opal_byte_object_t, something like this:
>>
>> opal_byte_object_t bo;
>>
>> bo.size = sizeof(my_data);
>> bo.data = *my_data;
>>
>> opal_dss.pack(*buffer, &bo, 1, OPAL_BYTE_OBJECT);
>>
>> Then on the other end:
>>
>> opal_byte_object_t *bo;
>> int32_t n = 1;   /* in: max values to unpack; out: values unpacked */
>>
>> opal_dss.unpack(*buffer, &bo, &n, OPAL_BYTE_OBJECT);
>>
>> You can then transfer the data into whatever storage you like. All this does
>> is pass the #bytes and the bytes as a collected unit - you could, of course,
>> simply pass the #bytes and bytes with independent packs if you wanted:
>>
>> int32_t num_bytes;
>> uint8_t *my_data;
>>
>> opal_dss.pack(*buffer, &num_bytes, 1, OPAL_INT32);
>> opal_dss.pack(*buffer, my_data, num_bytes, OPAL_BYTE);
>>
>> ...
>>
>> n = 1;   /* unpack exactly one OPAL_INT32 */
>> opal_dss.unpack(*buffer, &num_bytes, &n, OPAL_INT32);
>> my_data = (uint8_t*)malloc(num_bytes);
>> opal_dss.unpack(*buffer, my_data, &num_bytes, OPAL_BYTE);
>>
>>
>> Up to you.
>>
>> Hope that helps
>> Ralph
>>
>>
>>> Thanks in advance,
>>> Leonardo Fialho
>>>
>>>
>>> Ralph H Castain wrote:
>>>
>>>> I'm not sure exactly how you are trying to do this, but the usual procedure
>>>> would be:
>>>>
>>>> 1. call opal_dss.pack(*buffer, *data, #data, data_type) for each thing you
>>>> want to put in the buffer. So you might call this to pack a string:
>>>>
>>>> opal_dss.pack(*buffer, &string, 1, OPAL_STRING);
>>>>
>>>> 2. once you have everything packed into the buffer, you send the buffer
>>>> with
>>>>
>>>> orte_rml.send_buffer(*dest, *buffer, dest_tag, 0);
>>>>
>>>> What you will need is a tag that the daemon is listening on that won't
>>>> interfere with its normal operations - i.e., what you send won't get held
>>>> forever waiting to get serviced, and your servicing won't block us from
>>>> responding to a ctrl-c. You can probably use ORTE_RML_TAG_DAEMON, but you
>>>> need to ensure you don't block anything.
>>>>
>>>> BTW: how is the app figuring out the name of the remote daemon? The proc
>>>> will have access to the daemon's vpid (assuming it knows the nodename where
>>>> the daemon is running) in the ESS, but not the jobid - I assume you are
>>>> using some method to compute the daemon jobid from the apps?
>>>>
>>>>
>>>> On 6/17/08 12:08 PM, "Leonardo Fialho" <lfialho_at_[hidden]> wrote:
>>>>
>>>>
>>>>
>>>>> Hi All,
>>>>>
>>>>> I'm using RML to send log messages from a PML to an ORTE daemon (located
>>>>> on another node). I succeeded in sending the message header, but now I
>>>>> need to send the message data (buffer). How can I do it? The problem is:
>>>>> what data type do I need to use for packing/unpacking? I tried
>>>>> OPAL_DATA_VALUE but had no success...
>>>>>
>>>>> Thanks,
>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>
>>>>
>
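
The length-prefixed scheme described in the thread (pack the byte count, then
the raw bytes; on the receiving side read the count, allocate, and copy) can be
sketched in plain C. This is a standalone illustration and not the OPAL DSS
API: demo_buffer_t, demo_pack_bytes, demo_unpack_bytes and demo_roundtrip are
hypothetical names, and the fixed 256-byte capacity and the big-endian length
field are assumptions made only for this example.

```c
/* Standalone sketch of length-prefixed byte packing. NOT the OPAL DSS API. */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-capacity buffer standing in for a real message buffer. */
typedef struct {
    uint8_t bytes[256];
    size_t  packed;    /* number of bytes written so far */
    size_t  unpacked;  /* read cursor */
} demo_buffer_t;

/* Append a 4-byte big-endian length followed by the payload itself. */
static int demo_pack_bytes(demo_buffer_t *buf, const uint8_t *data,
                           int32_t num_bytes)
{
    assert(buf != NULL);
    if (num_bytes < 0) return -1;
    size_t need = 4 + (size_t)num_bytes;
    if (need > sizeof(buf->bytes) - buf->packed) return -1;

    uint32_t len = (uint32_t)num_bytes;
    buf->bytes[buf->packed++] = (uint8_t)(len >> 24);
    buf->bytes[buf->packed++] = (uint8_t)(len >> 16);
    buf->bytes[buf->packed++] = (uint8_t)(len >> 8);
    buf->bytes[buf->packed++] = (uint8_t)len;
    if (num_bytes > 0) {
        memcpy(buf->bytes + buf->packed, data, (size_t)num_bytes);
        buf->packed += (size_t)num_bytes;
    }
    return 0;
}

/* Read the length, allocate storage, then copy the payload into it.
 * On success the caller owns *data and must free() it. */
static int demo_unpack_bytes(demo_buffer_t *buf, uint8_t **data,
                             int32_t *num_bytes)
{
    assert(buf != NULL);
    if (buf->packed - buf->unpacked < 4) return -1;
    const uint8_t *p = buf->bytes + buf->unpacked;
    uint32_t len = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                   ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    if (len > buf->packed - buf->unpacked - 4) return -1;

    uint8_t *out = (uint8_t *)malloc(len ? len : 1);
    if (out == NULL) return -1;
    memcpy(out, p + 4, len);   /* destination is the storage itself */
    buf->unpacked += 4 + len;
    *data = out;
    *num_bytes = (int32_t)len;
    return 0;
}

/* Round-trip a 5-byte payload; returns the unpacked length or -1 on failure. */
static int32_t demo_roundtrip(void)
{
    const uint8_t msg[5] = { 'h', 'e', 'l', 'l', 'o' };
    demo_buffer_t buf;
    uint8_t *copy = NULL;
    int32_t n = 0;

    memset(&buf, 0, sizeof buf);
    if (demo_pack_bytes(&buf, msg, 5) != 0) return -1;
    if (demo_unpack_bytes(&buf, &copy, &n) != 0) return -1;
    int same = (n == 5) && memcmp(copy, msg, 5) == 0;
    free(copy);
    return same ? n : -1;
}
```

Calling demo_roundtrip() packs five bytes and reads them back. The point to
notice on the unpack side is that the copy targets the allocated storage
itself rather than the address of the pointer variable.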