My actual problem is that I don't know where to find the struct that holds
the information used to send messages to the procs.
I need to update it when I move a process away from its original site.
Is there something like this?
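
To make concrete what I mean by such a struct, here is a minimal sketch of a
per-rank contact table with an update call. Every name in it
(contact_table_t, contact_table_update, MAX_PROCS, MAX_URI) is hypothetical
and made up for illustration; this is not Open MPI's actual data structure
or API, only the kind of thing I am looking for.

```c
#include <assert.h>
#include <string.h>

#define MAX_PROCS 64   /* hypothetical upper bound on ranks */
#define MAX_URI   128  /* hypothetical maximum URI length, including '\0' */

/* Hypothetical contact table: one contact URI per rank. */
typedef struct {
    char uri[MAX_PROCS][MAX_URI];
    int  nprocs;
} contact_table_t;

/* Set up a table with nprocs empty entries.
 * Returns 0 on success, -1 if nprocs is out of range. */
static int contact_table_init(contact_table_t *t, int nprocs)
{
    assert(t != NULL);
    if (nprocs < 0 || nprocs > MAX_PROCS) {
        return -1;
    }
    memset(t, 0, sizeof(*t));
    t->nprocs = nprocs;
    return 0;
}

/* Replace the URI stored for a rank (same rank, new location).
 * Returns 0 on success, -1 on a bad rank or an over-long URI. */
static int contact_table_update(contact_table_t *t, int rank,
                                const char *new_uri)
{
    assert(t != NULL && new_uri != NULL);
    if (rank < 0 || rank >= t->nprocs) {
        return -1;
    }
    if (strlen(new_uri) >= MAX_URI) {
        return -1;
    }
    strcpy(t->uri[rank], new_uri);
    return 0;
}

/* Look up the URI currently stored for a rank; NULL on a bad rank. */
static const char *contact_table_lookup(const contact_table_t *t, int rank)
{
    assert(t != NULL);
    if (rank < 0 || rank >= t->nprocs) {
        return NULL;
    }
    return t->uri[rank];
}
```

The idea is that after a restart the sender would call something like
contact_table_update() for the restored rank only, so no collective
operation is needed.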
Thanks a lot.
2011/5/31 Hugo Meyer <meyer.hugo_at_[hidden]>
> Hello @ll.
> I need some help to restart the communication with a process that I
> restore on a different node. My situation is as follows:
> The process fails and is restored successfully on another node from a
> previous checkpoint that I sent there. Now, when a process tries to send a
> message to this restored process it will fail, or at least it will be
> blocked in *ompi_request_wait_completion*.
> So, when this happens, I have to send a message to the sender's daemon,
> which will have the URI of where the process has been restored. The daemon
> answers the proc with this URI, and the proc updates its info.
> So, I need to know where in the code I can capture this attempt to send and
> then send the message to its daemon, and where and how I can update this
> info to send the message to the right place (same rank but new URI).
> I have to do it in this way to avoid a collective communication.
> If you could give me a hand with this, it would be great.
> Best regards.