Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Quick fix for MPI_Publish_name
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-03-05 12:14:51


Unfortunately, it isn't quite that simple, but I do appreciate the suggestion - and the prod to get this fixed!

The change was required to help tools connect properly in some scenarios. Unfortunately, the logic was too simple and broke the ompi-server case. I've fixed it in the trunk and will port the fix to the 1.4 and 1.5 series.
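
For illustration only, here is a toy C model of the failure mode discussed in this thread. It is not the ORTE code: toy_get_route, toy_can_deliver, and the integer process ids are hypothetical names, and the premise that the server can reach the application only through an intermediate relay is an assumption made for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, NOT the ORTE implementation.
 * Pick the next hop for a message addressed to `target`.
 *   local_is_tool: stands in for the ORTE_PROC_IS_TOOL check
 *   tool_shortcut: whether the "tools route directly" branch is enabled
 *   relay:         intermediate process assumed to know how to reach target
 */
static int toy_get_route(bool local_is_tool, bool tool_shortcut,
                         int target, int relay)
{
    if (tool_shortcut && local_is_tool) {
        return target;   /* direct route, as in the quoted branch */
    }
    return relay;        /* otherwise go through the relay */
}

/* A hop is usable only if the local process has contact info for it. */
static bool toy_can_deliver(int next_hop, const int *known, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (known[i] == next_hop) {
            return true;
        }
    }
    return false;
}
```

In this model, a tool that knows only the relay (say id 1) and wants to answer an application (id 2) picks hop 2 when the shortcut is on and cannot deliver, while with the shortcut off it picks hop 1 and can.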

Thanks again!
Ralph

On Mar 4, 2011, at 9:36 AM, Suraj Prabhakaran wrote:

> Hello,
>
> This concerns the following bug, in which MPI_Publish_name was hanging:
>
> https://svn.open-mpi.org/trac/ompi/ticket/2681
>
> In fact, any call that contacted the ompi-server was hanging. Looking at all the communication between the application and ompi-server, it seemed that ompi-server was getting a bad route to the application when sending back the answer to publish/lookup/unpublish.
>
> In orte/mca/routed/binomial/routed_binomial.c, I found the following lines in the get_route() function:
>
> if (ORTE_PROC_IS_TOOL) {
>     ret = target;
>     goto found;
> }
>
> which, I believe, returned the target directly as the route to any tool. Comparing with 1.4.3, I could not understand the change that brought in the above case. So I simply commented it out, and ompi-server worked perfectly, with all the calls doing their job.
> What I do not know is whether this affects any other tool.
> Hope this is useful.
>
> Best,
> Suraj Prabhakaran