Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Fake Modex
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-06-03 10:12:27


On Jun 3, 2011, at 8:03 AM, Hugo Meyer wrote:

> Hello Ralph.
>
> Are you talking about an MPI communication? If so, then you need to update every proc's modex info for the proc that moved - this is something stored in each MPI proc's memory, so it isn't something that you can just get from the daemon on-demand. You'll have to provide the update to every single proc directly so that it has the info if/when it should decide to send an MPI message to the proc that moved.
> Yes, about MPI communications.
>
> See the modex database interface in orte/mca/grpcomm/base/grpcomm_base_modex.c. You'll have to create new code to send/recv an update message, but the code to update the database entry exists.
>
> By a send/recv update message, I think you mean something similar to the pack/unpack of the info, maybe also using the allgather as is done in grpcomm_base_modex.c.
>
> I took a look at the code and found the orte_grpcomm_base_update_modex_entries(&proc_name, &rbuf) function. I then printed the attr_name and got btl.tcp.1.7 and other attributes, but I'm not finding any information about the URI, address, or anything else that would allow me to communicate with another peer.
>
> I think I have to (in some way) update the endpoint somewhere, but I don't know where I can do this from, or whether there is a function that allows that kind of update.

When an MPI proc calls MPI_Init, each btl pushes its contact info into the modex database - one example is the btl.tcp.1.7 info you found there. That entry is for the TCP btl, which is probably what you are looking for. There is no way for you to edit that data - each btl encodes it in its own way and then adds it to the modex.

After doing that, the MPI_Init procedure calls grpcomm.modex to distribute the data across all procs in the job. Upon receipt, each proc updates its own modex db to include the new info. Unfortunately, because the modex is a collective, all procs must participate, so in your case you'll have to find a different way to distribute the update.
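To make the shape of that per-proc database concrete, here is a minimal sketch in C. It is not the ORTE implementation: modex_db_t, modex_db_set, and modex_db_get are hypothetical names, and the fixed-size table is an assumption made only for illustration. It shows the two properties described above: each entry is an opaque blob keyed by proc and attribute name, and receiving new info for a proc overwrites that proc's existing entry.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one proc's modex table; NOT the ORTE code. */
#define MODEX_MAX_ENTRIES 64

typedef struct {
    int    rank;      /* peer this entry describes */
    char   attr[32];  /* attribute name, e.g. "btl.tcp.1.7" */
    void  *blob;      /* opaque bytes; only the owning btl knows the encoding */
    size_t len;       /* number of bytes in blob */
} modex_entry_t;

typedef struct {
    modex_entry_t e[MODEX_MAX_ENTRIES];
    int n;            /* number of entries in use */
} modex_db_t;

/* Store or overwrite the blob for (rank, attr).
 * Returns 0 on success, -1 on failure. */
int modex_db_set(modex_db_t *db, int rank, const char *attr,
                 const void *blob, size_t len)
{
    assert(db != NULL && attr != NULL && blob != NULL);
    if (len == 0 || strlen(attr) >= sizeof(db->e[0].attr)) {
        return -1;
    }
    void *copy = malloc(len);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, blob, len);

    /* An existing entry for this peer is replaced, never edited in place. */
    for (int i = 0; i < db->n; i++) {
        if (db->e[i].rank == rank && strcmp(db->e[i].attr, attr) == 0) {
            free(db->e[i].blob);
            db->e[i].blob = copy;
            db->e[i].len  = len;
            return 0;
        }
    }
    if (db->n == MODEX_MAX_ENTRIES) {
        free(copy);
        return -1;
    }
    modex_entry_t *ent = &db->e[db->n++];
    ent->rank = rank;
    strcpy(ent->attr, attr);   /* length was checked above */
    ent->blob = copy;
    ent->len  = len;
    return 0;
}

/* Look up the blob for (rank, attr); returns NULL if there is no entry. */
const void *modex_db_get(const modex_db_t *db, int rank,
                         const char *attr, size_t *len)
{
    for (int i = 0; i < db->n; i++) {
        if (db->e[i].rank == rank && strcmp(db->e[i].attr, attr) == 0) {
            *len = db->e[i].len;
            return db->e[i].blob;
        }
    }
    return NULL;
}
```

In this sketch the database never interprets the blob. That mirrors the point above: the contact info for a moved proc has to be produced by the btl on the restarted proc and then replaced wholesale in every peer's table.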

Look in orte/mca/grpcomm/bad/grpcomm_bad_module.c at the modex function and follow that code thru the grpcomm/base functions to see how the modex info is retrieved, passed, and decoded on the far end.
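As a rough illustration of what a point-to-point update message might carry, here is a self-contained C sketch of packing and unpacking a single modex entry. The wire layout and the names modex_update_pack and modex_update_unpack are assumptions made up for this example; the real code uses its own buffer pack/unpack routines and its own format, so treat this only as a picture of the idea (proc identity, attribute name, opaque blob). It also assumes sender and receiver share the same byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire layout for one update (NOT the ORTE format):
 *   int32 rank | uint32 attr_len | attr bytes | uint32 blob_len | blob bytes
 * Integers are in host byte order. */

/* Pack one update into out[0..cap).
 * Returns the number of bytes written, or 0 if the update does not fit. */
size_t modex_update_pack(uint8_t *out, size_t cap, int32_t rank,
                         const char *attr, const void *blob,
                         uint32_t blob_len)
{
    assert(out != NULL && attr != NULL && blob != NULL);
    uint32_t attr_len = (uint32_t)strlen(attr);
    size_t need = sizeof rank + sizeof attr_len + attr_len
                + sizeof blob_len + blob_len;
    if (need > cap) {
        return 0;
    }
    uint8_t *p = out;
    memcpy(p, &rank, sizeof rank);         p += sizeof rank;
    memcpy(p, &attr_len, sizeof attr_len); p += sizeof attr_len;
    memcpy(p, attr, attr_len);             p += attr_len;
    memcpy(p, &blob_len, sizeof blob_len); p += sizeof blob_len;
    memcpy(p, blob, blob_len);             p += blob_len;
    return (size_t)(p - out);
}

/* Unpack one update from in[0..len).
 * On success returns 0, NUL-terminates attr, and points *blob into `in`.
 * Returns -1 if the message is truncated or attr does not fit in attr_cap. */
int modex_update_unpack(const uint8_t *in, size_t len, int32_t *rank,
                        char *attr, size_t attr_cap,
                        const uint8_t **blob, uint32_t *blob_len)
{
    uint32_t attr_len;
    const uint8_t *p = in;

    if (len < sizeof *rank + sizeof attr_len) {
        return -1;
    }
    memcpy(rank, p, sizeof *rank);         p += sizeof *rank;
    memcpy(&attr_len, p, sizeof attr_len); p += sizeof attr_len;
    len -= sizeof *rank + sizeof attr_len;

    if (attr_len >= attr_cap || len < attr_len + sizeof *blob_len) {
        return -1;
    }
    memcpy(attr, p, attr_len);
    attr[attr_len] = '\0';
    p += attr_len;
    memcpy(blob_len, p, sizeof *blob_len); p += sizeof *blob_len;
    len -= attr_len + sizeof *blob_len;

    if (len < *blob_len) {
        return -1;
    }
    *blob = p;   /* opaque bytes; the receiver stores them without decoding */
    return 0;
}
```

On the receiving side, the unpacked rank, attribute name, and blob would be handed to whatever routine updates the local modex entry for that proc; the blob itself stays opaque to everything except the btl that encoded it.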

>
> Thanks again.
>
> Hugo
>
>
>
> 2011/6/3 Ralph Castain <rhc_at_[hidden]>
> Are you talking about an MPI communication? If so, then you need to update every proc's modex info for the proc that moved - this is something stored in each MPI proc's memory, so it isn't something that you can just get from the daemon on-demand. You'll have to provide the update to every single proc directly so that it has the info if/when it should decide to send an MPI message to the proc that moved.
>
> This is why we do a modex upon restart - sending the change to every MPI proc is hardly scalable without a collective operation.
>
> See the modex database interface in orte/mca/grpcomm/base/grpcomm_base_modex.c. You'll have to create new code to send/recv an update message, but the code to update the database entry exists.
>
>
> On Jun 2, 2011, at 7:52 AM, Hugo Meyer wrote:
>
>> Hello again.
>>
>> My actual problem is that I don't know where to find the struct that holds the information used to send messages to the procs.
>>
>> Something like:
>>
>> Rank URI
>> 0 21222:tcp:192.168.1.1:1250
>> 1 21223:tcp:192.168.1.2:1250
>> ..... .....
>>
>>
>> What I need is to update it when I move a process from its original site. Is there something like this?
>>
>> Thanks a lot.
>>
>> Hugo
>>
>> 2011/5/31 Hugo Meyer <meyer.hugo_at_[hidden]>
>> Hello @ll.
>>
>> I need some help restarting communication with a process that I restore on a different node. My situation is as follows:
>>
>> The process fails and is successfully restored on another node from a previous checkpoint that I sent there. Now, when a process tries to send a message to this restored process, the send will fail, or at least it will block in ompi_request_wait_completion.
>>
>> So, when this happens, I have to send a message to the sender's daemon, which will have the URI of where the process has been restored. The daemon answers the proc with this URI, and the proc updates its info.
>>
>> So, I need to know where in the code I can capture this attempted send and then send the message to the sender's daemon, and where and how I can update this info so that the message goes to the right place (same rank but new URI).
>>
>> I have to do it this way to avoid a collective communication.
>>
>> If you could give me a hand with this, that would be great.
>>
>> Best regards.
>>
>> Hugo
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>