Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [OMPI svn] svn:open-mpi r18303
From: Aurélien Bouteiller (bouteill_at_[hidden])
Date: 2008-04-25 19:38:53


To follow up on George's last remark: currently, when a job dies without
unpublishing a port with Unpublish (due to poor user programming, a
failure, or an abort), ompi-server keeps the reference forever, and a new
application therefore cannot publish under the same name again. So I
guess this is a good opportunity to correctly clean up all published/opened
ports when the application ends (for whatever reason).
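
To make the cleanup idea concrete, here is a minimal toy model in C. It is not ompi-server code: the table layout and the names `registry_publish` and `registry_cleanup_job` are made up for illustration. It only shows why a stale entry blocks a later publish under the same name, and how dropping every entry owned by a finished job would unblock it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 16
#define NAME_LEN    64

/* Hypothetical record: one published (service name -> port) pair,
 * tagged with the id of the job that published it. */
typedef struct {
    int  in_use;
    int  job_id;
    char service[NAME_LEN];
    char port[NAME_LEN];
} entry_t;

/* Static storage is zero-initialized, so every slot starts unused. */
static entry_t table[MAX_ENTRIES];

/* Publish a name. Returns 0 on success, -1 if the name is already
 * taken (by any job) or the table is full. */
static int registry_publish(int job_id, const char *service, const char *port)
{
    int free_slot = -1;

    assert(NULL != service && NULL != port);
    for (int i = 0; i < MAX_ENTRIES; i++) {
        if (table[i].in_use && 0 == strcmp(table[i].service, service)) {
            return -1;              /* stale or live entry blocks the name */
        }
        if (!table[i].in_use && free_slot < 0) {
            free_slot = i;
        }
    }
    if (free_slot < 0) {
        return -1;
    }
    table[free_slot].in_use = 1;
    table[free_slot].job_id = job_id;
    snprintf(table[free_slot].service, NAME_LEN, "%s", service);
    snprintf(table[free_slot].port, NAME_LEN, "%s", port);
    return 0;
}

/* Drop every entry owned by job_id, as would happen when that job
 * ends for whatever reason. Returns the number of entries removed. */
static int registry_cleanup_job(int job_id)
{
    int removed = 0;

    for (int i = 0; i < MAX_ENTRIES; i++) {
        if (table[i].in_use && table[i].job_id == job_id) {
            table[i].in_use = 0;
            removed++;
        }
    }
    return removed;
}
```

In this model, a second job publishing the same service name fails until `registry_cleanup_job` has been called for the first job, which is the behavior I would like the real server to have on job termination.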

Another cool feature would be to have mpirun behave as an ompi-server
and publish a suitable URI if requested to do so (if the urifile does
not exist yet?). I know from the source code that mpirun already
includes everything needed to offer this feature, except the ability to
provide a suitable URI.

   Aurelien

On 25 Apr 2008, at 19:19, George Bosilca wrote:

> Ralph,
>
> Thanks for your concern regarding the level of compliance of our
> implementation of the MPI standard. I don't know who the MPI
> gurus you talked with about this issue were, but I can tell you that
> for once the MPI standard is pretty clear about this.
>
> As stated by Aurelien in his last email, the use of the plural in
> several sentences strongly suggests that the status of a port should
> not be implicitly modified by MPI_Comm_accept or MPI_Comm_connect.
> Moreover, at the beginning of the chapter in the MPI standard, it is
> specified that connect/accept work exactly as in TCP. In other words,
> once the port is opened, it stays open until the user explicitly
> closes it.
>
> However, not all corner cases are addressed by the MPI standard.
> What happens on MPI_Finalize? It's a good question. Personally, I
> think we should stick with the TCP similarities. The port should be
> not only closed but also unpublished. This will solve all issues with
> people trying to look up a port once the originator is gone.
>
> george.
>
> On Apr 25, 2008, at 5:25 PM, Ralph Castain wrote:
>
>> As I said, it makes no difference to me. I just want to ensure that everyone
>> agrees on the interpretation of the MPI standard. We have had these
>> discussions in the past, with differing views. My guess here is that the port
>> was left open mostly because the person who wrote the C-binding forgot to
>> close it. ;-)
>>
>> So, you MPI folks: do we allow multiple connections against a single port,
>> and leave the port open until explicitly closed? If so, then do we generate
>> an error if someone calls MPI_Finalize without first closing the port? Or do
>> we automatically close any open ports when finalize is called?
>>
>> Or do we automatically close the port after the connect/accept is completed?
>>
>> Thanks
>> Ralph
>>
>>
>>
>> On 4/25/08 3:13 PM, "Aurélien Bouteiller" <bouteill_at_[hidden]> wrote:
>>
>>> Actually, the port was still left open forever before the change. The
>>> bug damaged the port string, and it was not usable anymore, not only
>>> in subsequent Comm_accept calls, but also in Close_port or Unpublish_name.
>>>
>>> To answer your open port concern more specifically: if the user
>>> does not want to have an open port anymore, he should explicitly
>>> call MPI_Close_port and not rely on MPI_Comm_accept to close it.
>>> Actually, the standard suggests the exact contrary: section 5.4.2
>>> states "it must call MPI_Open_port to establish a port [...] it must
>>> call MPI_Comm_accept to accept connections from clients". Because
>>> there are multiple clients AND multiple connections in that sentence, I
>>> assume the port can be used in multiple accepts.
>>>
>>> Aurelien
>>>
>>> On 25 Apr 2008, at 16:53, Ralph Castain wrote:
>>>
>>>> Hmmm...just to clarify, this wasn't a "bug". It was my understanding per the
>>>> MPI folks that a separate, unique port had to be created for every
>>>> invocation of Comm_accept. They didn't want a port hanging around open, and
>>>> their plan was to close the port immediately after the connection was
>>>> established.
>>>>
>>>> So dpm_orte was written to that specification. When I reorganized the code,
>>>> I left the logic as it had been written - which was actually done by the MPI
>>>> side of the house, not me.
>>>>
>>>> I have no problem with making the change. However, since the specification
>>>> was created on the MPI side, I just want to make sure that the MPI folks all
>>>> realize this has now been changed. Obviously, if this change in spec is
>>>> adopted, someone needs to make sure that the C and Fortran bindings -do not-
>>>> close that port any more!
>>>>
>>>> Ralph
>>>>
>>>>
>>>>
>>>> On 4/25/08 2:41 PM, "bouteill_at_[hidden]" <bouteill_at_[hidden]> wrote:
>>>>
>>>>> Author: bouteill
>>>>> Date: 2008-04-25 16:41:44 EDT (Fri, 25 Apr 2008)
>>>>> New Revision: 18303
>>>>> URL: https://svn.open-mpi.org/trac/ompi/changeset/18303
>>>>>
>>>>> Log:
>>>>> Fix a bug that prevented using the same port (as returned by Open_port) for
>>>>> several Comm_accept calls
>>>>>
>>>>>
>>>>> Text files modified:
>>>>> trunk/ompi/mca/dpm/orte/dpm_orte.c | 19 ++++++++++---------
>>>>> 1 files changed, 10 insertions(+), 9 deletions(-)
>>>>>
>>>>> Modified: trunk/ompi/mca/dpm/orte/dpm_orte.c
>>>>> ==============================================================================
>>>>> --- trunk/ompi/mca/dpm/orte/dpm_orte.c (original)
>>>>> +++ trunk/ompi/mca/dpm/orte/dpm_orte.c 2008-04-25 16:41:44 EDT (Fri, 25 Apr 2008)
>>>>> @@ -848,8 +848,14 @@
>>>>> {
>>>>> char *tmp_string, *ptr;
>>>>>
>>>>> + /* copy the RML uri so we can return a malloc'd value
>>>>> + * that can later be free'd
>>>>> + */
>>>>> + tmp_string = strdup(port_name);
>>>>> +
>>>>> /* find the ':' demarking the RML tag we added to the end */
>>>>> - if (NULL == (ptr = strrchr(port_name, ':'))) {
>>>>> + if (NULL == (ptr = strrchr(tmp_string, ':'))) {
>>>>> + free(tmp_string);
>>>>> return NULL;
>>>>> }
>>>>>
>>>>> @@ -863,15 +869,10 @@
>>>>> /* see if the length of the RML uri is too long - if so,
>>>>> * truncate it
>>>>> */
>>>>> - if (strlen(port_name) > MPI_MAX_PORT_NAME) {
>>>>> - port_name[MPI_MAX_PORT_NAME] = '\0';
>>>>> + if (strlen(tmp_string) > MPI_MAX_PORT_NAME) {
>>>>> + tmp_string[MPI_MAX_PORT_NAME] = '\0';
>>>>> }
>>>>> -
>>>>> - /* copy the RML uri so we can return a malloc'd value
>>>>> - * that can later be free'd
>>>>> - */
>>>>> - tmp_string = strdup(port_name);
>>>>> -
>>>>> +
>>>>> return tmp_string;
>>>>> }
>>>>>
>>>>> _______________________________________________
>>>>> svn mailing list
>>>>> svn_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/svn
>>>>
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
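
For readers following the r18303 diff quoted in this thread, the sketch below contrasts the two orderings in standalone C. It is not the dpm_orte.c code. It assumes the lines elided from the diff terminate the string at the ':' found by strrchr, and the function names and the sample port string are invented for illustration. `dup_string` stands in for strdup so the example relies only on standard C.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Portable stand-in for strdup(): returns a malloc'd copy or NULL. */
static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (NULL != copy) {
        memcpy(copy, s, len);
    }
    return copy;
}

/* Pre-change ordering: edit the caller's buffer, then copy it.
 * The caller's port string loses its trailing ":tag" as a side effect. */
static char *split_in_place(char *port_name)
{
    char *ptr;

    assert(NULL != port_name);
    if (NULL == (ptr = strrchr(port_name, ':'))) {
        return NULL;
    }
    *ptr = '\0';                    /* assumed step: cut the tag off */
    return dup_string(port_name);
}

/* Post-change ordering: copy first, then edit only the copy.
 * The caller's port string is left untouched and stays reusable. */
static char *split_on_copy(const char *port_name)
{
    char *tmp_string, *ptr;

    assert(NULL != port_name);
    if (NULL == (tmp_string = dup_string(port_name))) {
        return NULL;
    }
    if (NULL == (ptr = strrchr(tmp_string, ':'))) {
        free(tmp_string);           /* do not leak the copy on failure */
        return NULL;
    }
    *ptr = '\0';
    return tmp_string;
}
```

With a made-up port string such as "host.example:5000:42", `split_in_place` leaves the caller holding "host.example:5000", so a second accept on the same port name no longer sees the tag. `split_on_copy` returns the same malloc'd result while the caller's string keeps its ":42" suffix. In both cases the caller is responsible for freeing the returned string.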