Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Update orte_proc structure
From: Leonardo Fialho (lfialho_at_[hidden])
Date: 2008-10-01 12:37:37


Forget it. I found the problem... a small patch to
orte_dt_pack/unpack_fns solved my problem...

Leonardo
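
For readers hitting the same symptom: the fix above is consistent with how a separate tool such as orte-ps gets its data. It receives a serialized copy of each structure, so a field that the pack/unpack routines do not handle arrives with its default (invalid) value even though the HNP's in-memory copy is correct. Below is a minimal, self-contained sketch of that effect. Every name in it (demo_name_t, demo_proc_t, demo_pack_proc, demo_unpack_proc, demo_roundtrip, DEMO_INVALID) is hypothetical and only stands in for the real ORTE types and routines; it is not the actual patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an "invalid" jobid/vpid value. */
#define DEMO_INVALID UINT32_MAX

/* Hypothetical stand-in for a process name (jobid, vpid). */
typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} demo_name_t;

/* Hypothetical stand-in for a proc object with the new field. */
typedef struct {
    demo_name_t name;
    demo_name_t protector;   /* newly added field */
} demo_proc_t;

/* Pack a proc into buf and return the number of bytes written.
 * patched == 0 mimics a pack routine written before the field existed. */
size_t demo_pack_proc(uint8_t *buf, const demo_proc_t *p, int patched)
{
    size_t off = 0;
    memcpy(buf + off, &p->name, sizeof(p->name));
    off += sizeof(p->name);
    if (patched) {
        memcpy(buf + off, &p->protector, sizeof(p->protector));
        off += sizeof(p->protector);
    }
    return off;
}

/* Unpack a proc from buf. The new object starts with an invalid
 * protector; only a patched routine overwrites it. */
void demo_unpack_proc(const uint8_t *buf, demo_proc_t *p, int patched)
{
    size_t off = 0;
    p->protector.jobid = DEMO_INVALID;
    p->protector.vpid = DEMO_INVALID;
    memcpy(&p->name, buf + off, sizeof(p->name));
    off += sizeof(p->name);
    if (patched) {
        memcpy(&p->protector, buf + off, sizeof(p->protector));
    }
}

/* Send src through pack + unpack, as a remote tool would receive it. */
demo_proc_t demo_roundtrip(const demo_proc_t *src, int patched)
{
    uint8_t buf[sizeof(demo_proc_t)];
    demo_proc_t dst;
    size_t len = demo_pack_proc(buf, src, patched);
    assert(len <= sizeof(buf));
    demo_unpack_proc(buf, &dst, patched);
    return dst;
}
```

Calling demo_roundtrip(&src, 0) returns a copy whose protector is still DEMO_INVALID, whatever src held, while demo_roundtrip(&src, 1) carries the field across. The sender and the receiver have to agree on the field list, so both routines need the change.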

Leonardo Fialho wrote:
> Hi All,
>
> I have a small question about how to update the orte_proc structure.
>
> I have modified the orte_proc structure to include another field
> (of type orte_process_name_t) to describe the node that stores my
> checkpoints and logs:
>
> struct orte_proc_t {
>     ...
> #if OPAL_ENABLE_FT_RADIC == 1
>     /* protector node */
>     orte_process_name_t protector;
> #endif
> };
>
> So I added code to orted_comm.c that I think should update the job
> structure:
> /* Update the structure */
> if (NULL == (jdata = orte_get_job_data_object(sender_jobid))) {
>     ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
>     goto CLEANUP;
> }
> procs = (orte_proc_t**)jdata->procs->addr;
> if (NULL == procs[sender_vpid]) {
>     ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
>     goto CLEANUP;
> }
> procs[sender_vpid]->protector.jobid = protector_jobid;
> procs[sender_vpid]->protector.vpid = protector_vpid;
> opal_output(0, "%s is the protector of %s",
>             ORTE_NAME_PRINT(&procs[sender_vpid]->name),
>             ORTE_NAME_PRINT(&procs[sender_vpid]->protector));
>
> In the log of the ORTE daemon that acts as the HNP I can see that the
> correct information was added to the orte_proc structure, but when I
> use my modified version of orte-ps I get incorrect information
> ([[INVALID],INVALID]). Below is the code I used in orte-ps:
>
> #if OPAL_ENABLE_FT_RADIC == 1
>     protector = orte_util_print_name_args(&vpid->protector);
>     printf("%*s |", len_protector, protector);
> #endif
>
> The question is: why does the HNP show the correct information while
> orte-ps doesn't?
>
> Thanks

-- 
Leonardo Fialho
Computer Architecture and Operating Systems Department - CAOS
Universidad Autonoma de Barcelona - UAB
ETSE, Edifcio Q, QC/3088
http://www.caos.uab.es
Phone: +34-93-581-2888
Fax: +34-93-581-2478