Open MPI Development Mailing List Archives

Subject: [OMPI devel] Update orte_proc structure
From: Leonardo Fialho (lfialho_at_[hidden])
Date: 2008-10-01 11:09:18


Hi All,

I have a question about how to update the orte_proc structure.

I have modified the orte_proc structure to include another field
(of type orte_process_name_t) that describes the node which stores my
checkpoints and logs:

struct orte_proc_t {
...
#if OPAL_ENABLE_FT_RADIC == 1
    /* protector node */
    orte_process_name_t protector;
#endif
};

I have then added some code to orted_comm.c which I think should
update the job structure:

/* Update the structure */
if (NULL == (jdata = orte_get_job_data_object(sender_jobid))) {
    ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
    goto CLEANUP;
}
procs = (orte_proc_t**)jdata->procs->addr;
if (NULL == procs[sender_vpid]) {
    ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
    goto CLEANUP;
}
procs[sender_vpid]->protector.jobid = protector_jobid;
procs[sender_vpid]->protector.vpid = protector_vpid;
opal_output(0, "%s is the protector of %s",
            ORTE_NAME_PRINT(&procs[sender_vpid]->name),
            ORTE_NAME_PRINT(&procs[sender_vpid]->protector));

In the log of the orte daemon that acts as the HNP I can see that the
correct information was added to the orte_proc structure. However, when
I use my modified version of orte-ps, it prints invalid information
([[INVALID],INVALID]). Below is the code I have used in orte-ps:

#if OPAL_ENABLE_FT_RADIC == 1
        protector = orte_util_print_name_args(&vpid->protector);
        printf("%*s |", len_protector, protector);
#endif

The question is: why does the HNP show the correct information while
orte-ps does not?
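
One possibility worth ruling out (an assumption, not something confirmed
in this message) is that orte-ps works on a copy of the proc data that
was packed by the HNP and unpacked on the orte-ps side, and that the
pack/unpack routines do not yet know about the new field. The sketch
below is a toy model of that situation only. All demo_* names and
DEMO_INVALID are hypothetical stand-ins and are not part of ORTE.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEMO_INVALID UINT32_MAX /* stand-in for an INVALID jobid/vpid */

/* Hypothetical stand-ins for orte_process_name_t and orte_proc_t. */
typedef struct { uint32_t jobid; uint32_t vpid; } demo_name_t;
typedef struct { demo_name_t name; demo_name_t protector; } demo_proc_t;

/* Constructor: every field starts out INVALID. */
void demo_proc_init(demo_proc_t *p) {
    p->name.jobid = p->name.vpid = DEMO_INVALID;
    p->protector.jobid = p->protector.vpid = DEMO_INVALID;
}

/* Pack a proc into buf. With send_protector == 0 this models a pack
 * routine written before the protector field existed. */
size_t demo_pack(const demo_proc_t *p, unsigned char *buf, int send_protector) {
    size_t len = 0;
    memcpy(buf + len, &p->name, sizeof p->name);
    len += sizeof p->name;
    if (send_protector) {
        memcpy(buf + len, &p->protector, sizeof p->protector);
        len += sizeof p->protector;
    }
    return len;
}

/* Unpack into a freshly constructed proc; anything that was not sent
 * keeps its constructor default (INVALID). */
void demo_unpack(demo_proc_t *p, const unsigned char *buf, size_t len) {
    assert(len >= sizeof p->name);
    demo_proc_init(p);
    memcpy(&p->name, buf, sizeof p->name);
    if (len >= sizeof p->name + sizeof p->protector) {
        memcpy(&p->protector, buf + sizeof p->name, sizeof p->protector);
    }
}

/* The sender sets the protector locally, then ships the proc to a receiver. */
demo_proc_t demo_roundtrip(int send_protector) {
    demo_proc_t src, dst;
    unsigned char buf[sizeof(demo_proc_t)];
    demo_proc_init(&src);
    src.name = (demo_name_t){1, 0};
    src.protector = (demo_name_t){1, 3};
    size_t len = demo_pack(&src, buf, send_protector);
    demo_unpack(&dst, buf, len);
    return dst;
}
```

In this model, demo_roundtrip(0) returns a proc whose protector is still
INVALID on the receiving side even though the sender set it, while
demo_roundtrip(1) carries the value across. If the real tool behaves the
same way, the place to look would be wherever orte_proc_t is serialized
for orte-ps.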

Thanks

-- 
Leonardo Fialho
Computer Architecture and Operating Systems Department - CAOS
Universidad Autonoma de Barcelona - UAB
ETSE, Edifcio Q, QC/3088
http://www.caos.uab.es
Phone: +34-93-581-2888
Fax: +34-93-581-2478