Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [PATCH] iof/hnp: daemon part of the sink structure is not initialized when forwarding stdin to all ranks
From: Ralph Castain (rhc_at_[hidden])
Date: 2012-03-06 10:47:45


You are quite right - good catch! Fixed in trunk with r26107 - will file
patch for 1.5.
Ralph

On Tue, Mar 6, 2012 at 4:18 AM, nadia.derbey <Nadia.Derbey_at_[hidden]> wrote:

> Hi,
>
> When forwarding stdin to all ranks in the job (mpirun --stdin all), the
> following error message is output:
>
> ------------------
> [berlin73:02223] [[56600,0],0] ORTE_ERROR_LOG: A message is attempting
> to be sent to a process whose contact information is unknown in
> file ../../../../../orte/mca/rml/oob/rml_oob_send.c at line 316
> [berlin73:02223] [[56600,0],0] unable to find address for
> [[INVALID],INVALID]
> [berlin73:02223] [[56600,0],0] ORTE_ERROR_LOG: A message is attempting
> to be sent to a process whose contact information is unknown in
> file ../../../../../orte/mca/iof/hnp/iof_hnp_send.c at line 116
> ------------------
>
> This is because the daemon part of the sink structure is not
> initialized in hnp_push() when the destination vpid is
> ORTE_VPID_WILDCARD.
> Then, when orte_iof_hnp_read_local_handler() is called, it calls
> orte_iof_hnp_send_data_to_endpoint() with a sink->daemon that is not
> set.
> As a result, orte_iof_hnp_send_data_to_endpoint() does not call
> orte_grpcomm.xcast() but orte_rml.send_buffer_nb(), with an invalid host.
>
> The attached patch, applied on the trunk, solves the issue. The patch is
> trivial, but since this is the first time I have looked at the iof code,
> I'm not sure of all its impacts...
>
> Regards,
> Nadia
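
The failure described in the quoted report can be illustrated with a small, self-contained C sketch. This is not the ORTE source and not necessarily what r26107 changes: every type, constant, and function name below (proc_name_t, sink_t, make_sink, choose_route, VPID_WILDCARD, VPID_INVALID) is a hypothetical stand-in, and the routing rule (wildcard daemon means broadcast, anything else means a direct send) is an assumption drawn from the description above. The "never initialized" daemon field is modeled deterministically with an invalid marker so the example behaves the same on every run.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ORTE constants named in the report. */
#define VPID_INVALID  UINT32_MAX
#define VPID_WILDCARD (UINT32_MAX - 1)

typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} proc_name_t;

/* A sink has a target process and the daemon that relays data to it. */
typedef struct {
    proc_name_t name;
    proc_name_t daemon;
} sink_t;

typedef enum { ROUTE_XCAST, ROUTE_DIRECT, ROUTE_ERROR } route_t;

/*
 * Build a sink for `target`. When `set_daemon_for_wildcard` is false, the
 * wildcard branch leaves the daemon field at its invalid marker, which
 * models the missing initialization described in the report.
 */
sink_t make_sink(proc_name_t target, bool set_daemon_for_wildcard)
{
    sink_t s;
    s.name = target;
    s.daemon.jobid = 0;
    s.daemon.vpid = VPID_INVALID;      /* models "daemon never set" */

    if (VPID_WILDCARD == target.vpid) {
        if (set_daemon_for_wildcard) {
            /* Assumed fix: mark the daemon as "all daemons" too. */
            s.daemon.vpid = VPID_WILDCARD;
        }
    } else {
        /* Assumption for the sketch: daemon 1 hosts every single target. */
        s.daemon.vpid = 1;
    }
    return s;
}

/* Decide how to forward data, based only on the daemon (host) name. */
route_t choose_route(const proc_name_t *host)
{
    if (VPID_WILDCARD == host->vpid) {
        return ROUTE_XCAST;            /* broadcast to every daemon */
    }
    if (VPID_INVALID == host->vpid) {
        return ROUTE_ERROR;            /* "contact information is unknown" */
    }
    return ROUTE_DIRECT;               /* point-to-point send */
}
```

With a wildcard target and the daemon left unset, choose_route() falls through the broadcast check and reports an error, which mirrors the "unable to find address for [[INVALID],INVALID]" message; once the daemon is also marked as wildcard, the same call selects the broadcast path.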