Thanks Ralph for your reply.

2011/3/21 Ralph Castain <rhc@open-mpi.org>
You should never access a pointer array's data area that way (i.e., by index against the raw data). You really should do:

if (NULL == (proc = (orte_proc_t*)opal_pointer_array_get_item(jdata->procs, vpid))) {
    /* error report */
}
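
As a rough illustration of why the accessor is safer than indexing the raw data, here is a minimal, self-contained sketch. The type `demo_pointer_array_t` and the function `demo_get_item` are hypothetical stand-ins written for this example, not the real OPAL definitions. The point is only that an accessor can reject an out-of-range index, whereas raw indexing cannot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pointer array; NOT the real OPAL type. */
typedef struct {
    int size;      /* number of slots currently allocated */
    void **addr;   /* slot storage; unused slots hold NULL */
} demo_pointer_array_t;

/*
 * Bounds-checked lookup (assumed behavior for this sketch):
 * returns NULL when the index is out of range or the slot is empty.
 */
static void *demo_get_item(const demo_pointer_array_t *array, int index)
{
    assert(array != NULL);
    if (index < 0 || index >= array->size) {
        return NULL;   /* raw indexing would read past the table here */
    }
    return array->addr[index];
}
```

Note that in this sketch a NULL result is ambiguous: it can mean either "index out of range" or "slot never filled in".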


I've changed my code to do it this way, but I'm getting the same result: NULL when asking about a dead process.
 
The errmgr generally doesn't remove a process object upon failure - it just sets its state to some appropriate value. However, depending upon where you are trying to do this, and the history that got you down this code path, it is possible.

I'm writing this code in errmgr_orted.c, and it is executed when a process fails.
 

Also, remember that if you are in a daemon, then the jdata objects are not populated. The daemons work exclusively from the orte_local_jobdata and orte_local_children lists, so you would have to find your process there.
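
For illustration, a lookup over a daemon-local children list could be sketched as follows. The types `demo_child_t` and `demo_child_list_t` and the function `demo_find_child` are hypothetical stand-ins for this example. The real ORTE list and child structures are different, so this shows only the search pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real ORTE child and list types differ. */
typedef struct demo_child {
    uint32_t jobid;
    uint32_t vpid;
    int alive;                  /* 0 once the process has failed */
    struct demo_child *next;
} demo_child_t;

typedef struct {
    demo_child_t *head;         /* first local child, or NULL if none */
} demo_child_list_t;

/* Walk the local list and return the entry matching (jobid, vpid), or NULL. */
static demo_child_t *demo_find_child(const demo_child_list_t *list,
                                     uint32_t jobid, uint32_t vpid)
{
    assert(list != NULL);
    for (demo_child_t *c = list->head; c != NULL; c = c->next) {
        if (c->jobid == jobid && c->vpid == vpid) {
            return c;   /* found, even if the process is marked as failed */
        }
    }
    return NULL;        /* this daemon has no such child */
}
```

In this sketch a failed process stays in the list with `alive` set to 0, so the lookup still finds it.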

That's why I'm asking the HNP for the jdata using ORTE_DAEMON_REPORT_JOB_INFO_CMD; I assume it has the information about the dead process.

Any ideas?

Best regards.

Hugo Meyer