On 3/5/08 3:31 PM, "Aurélien Bouteiller" <bouteill_at_[hidden]> wrote:
> From: Camille Coti <coti_at_[hidden]>
> Date: March 5, 2008 17:26:32 EST (USA)
> To: devel_at_[hidden]
> Subject: orte_job_data
> I had a look at how the job data are kept during the life cycle of a
> The orte_job_data pointer array contains two elements:
> addr is filled during rte_init().
> addr is filled by PLM at setup time and contains the map on
> which the job is spawned.
> What does the first entry correspond to?
The first element in the array is for the daemon job. All subsequent entries
correspond to applications.
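To make the layout concrete, here is a minimal sketch of that indexing convention. The sketch_* names and the struct are hypothetical stand-ins for illustration, not the actual ORTE types or API; the only thing taken from the description above is that slot 0 holds the daemon job and later slots hold applications.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a job data object; not the real ORTE type. */
typedef struct {
    int jobid;
    const char *label;
} sketch_job_t;

/* Assumption taken from the text: slot 0 of the array is the daemon job. */
#define SKETCH_DAEMON_JOB_INDEX ((size_t)0)

/* Returns 1 if the slot holds the daemon job, 0 if it holds an application. */
int sketch_is_daemon_job(size_t index)
{
    return index == SKETCH_DAEMON_JOB_INDEX;
}

/* Look up a job by array position; NULL if the index is out of range. */
const sketch_job_t *sketch_get_job(const sketch_job_t *const *jobs,
                                   size_t njobs, size_t index)
{
    assert(jobs != NULL);
    return index < njobs ? jobs[index] : NULL;
}
```

With a two-element array like the one you observed, index 0 would be the daemons and index 1 your application.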
> Besides, when I dump the map contained in the second element during
> the execution of my application (ie, after PLM launched the job), each
> node entry contains: "Daemon launched: False". Is this expected?
Yes, that is expected - we don't bother to update that field in the job map
when we launch the daemons for an application. However, the next time the
job map is retrieved - say to map a comm_spawn'd application - then we do
fill it in. The get_job_map function will check the daemon's job data object
to see if a daemon has been launched on each node and fill in the field.
The reason we don't automatically update that field as the daemons launch is
that we are looking ahead to when we minimize memory usage. When that
happens, we probably won't retain the map at all - we'll reconstruct
whatever is needed from the minimal stored set of data when it is needed,
and then dump it when done.
So, since we'll probably have to regenerate the map anyway, we don't bother
to maintain the daemon spawned field. We just fill it in when we retrieve a
map. If you want to see the filled-in values, call orte_rmaps.get_job_map
and then dump the returned map object.
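The fill-on-retrieval behavior can be sketched roughly as follows. This is only an illustration of the pattern under stated assumptions: the sketch_* types and the sketch_get_job_map function are hypothetical and do not reproduce the real orte_rmaps.get_job_map signature or the actual ORTE data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_NODES 8

/* Hypothetical daemon job record: which nodes have a running daemon. */
typedef struct {
    bool daemon_on_node[SKETCH_MAX_NODES];
} sketch_daemon_job_t;

/* Hypothetical node entry in an application's map. */
typedef struct {
    int node_id;
    bool daemon_launched;   /* not maintained at launch; may be stale */
} sketch_node_t;

typedef struct {
    sketch_node_t nodes[SKETCH_MAX_NODES];
    size_t num_nodes;
} sketch_job_map_t;

/* Retrieval refreshes daemon_launched from the daemon job's data,
 * so the field is only trustworthy on a map obtained through this call. */
sketch_job_map_t *sketch_get_job_map(sketch_job_map_t *map,
                                     const sketch_daemon_job_t *daemons)
{
    assert(map != NULL && daemons != NULL);
    assert(map->num_nodes <= SKETCH_MAX_NODES);
    for (size_t i = 0; i < map->num_nodes; i++) {
        int id = map->nodes[i].node_id;
        map->nodes[i].daemon_launched =
            id >= 0 && id < SKETCH_MAX_NODES && daemons->daemon_on_node[id];
    }
    return map;
}
```

Dumping the stored map directly corresponds to reading daemon_launched before the call, which is why you see "False" everywhere; dumping the map returned by the retrieval call shows the refreshed values.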