Open MPI Development Mailing List Archives

Subject: [OMPI devel] [PATCH] iof/hnp: daemon part of the sink structure is not initialized when forwarding stdin to all ranks
From: nadia.derbey (Nadia.Derbey_at_[hidden])
Date: 2012-03-06 07:18:14


Hi,

When forwarding stdin to all ranks in the job (mpirun --stdin all), the
following error message is printed:

------------------
[berlin73:02223] [[56600,0],0] ORTE_ERROR_LOG: A message is attempting
to be sent to a process whose contact information is unknown in
file ../../../../../orte/mca/rml/oob/rml_oob_send.c at line 316
[berlin73:02223] [[56600,0],0] unable to find address for
[[INVALID],INVALID]
[berlin73:02223] [[56600,0],0] ORTE_ERROR_LOG: A message is attempting
to be sent to a process whose contact information is unknown in
file ../../../../../orte/mca/iof/hnp/iof_hnp_send.c at line 116
------------------

This happens because hnp_push() does not initialize the daemon part of
the sink structure when the destination vpid is ORTE_VPID_WILDCARD.
When orte_iof_hnp_read_local_handler() is later called, it passes the
unset sink->daemon to orte_iof_hnp_send_data_to_endpoint(). That
function in turn does not take the orte_grpcomm.xcast() path; instead it
calls orte_rml.send_buffer_nb() with an invalid host, which produces the
errors shown above.

The attached patch, applied to the trunk, solves the issue. The patch is
trivial, but since this is the first time I have looked at the iof code,
I'm not sure of all its impacts...

Regards,
Nadia