Open MPI User's Mailing List Archives

From: Patrick Jessee (pj_at_[hidden])
Date: 2006-06-28 10:31:36


Brian Barrett wrote:

>On Wed, 2006-06-28 at 09:43 -0400, Patrick Jessee wrote:
>
>
>>Hello. I've tracked down the source of the previously reported startup
>>problem with Open MPI 1.1. On startup, it fails with the message:
>>
>>mca_oob_tcp_accept: accept() failed with errno 9.
>> :
>>
>>This didn't happen with 1.0.2.
>>
>>The trigger for this behavior is standard input being closed
>>before mpirun is called. In this particular case, mpirun was being
>>started by a wrapper Bourne shell script that had standard input
>>closed. It's fairly easy to reproduce. Interestingly, the problem is
>>not seen if standard input is opened from an arbitrary device such as
>>/dev/null.
>>
>>This is the first MPI with which we've seen this behavior, and it didn't
>>happen with 1.0.2 so something must have been introduced in 1.1.
>>Perhaps 1.1 makes some assumptions about the state of the standard file
>>descriptors.
>>
>>Hopefully this feedback is helpful to someone in resolving the problem.
>>
>>
>
>Yup, in order to fix some other things with standard input that users
>rightly were complaining about, we changed some standard input handling
>between 1.0.2 and 1.1. My recommendation is to just tie it to /dev/null
>instead. We're unlikely to fix this issue in the near future.
>
>
Thanks for the reply. We can work around the issue in the near term;
however, this seems like a restriction/assumption that could possibly be
addressed in Open MPI in the long run. (It's easy to work around or avoid
once you know what the restriction is, but tracking down the
problem takes some time.) Anyway, perhaps it could be placed on a
todo list so it doesn't get lost. I'd be happy to provide any
additional information if needed.
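
For anyone else who hits this, here is a minimal sketch of the workaround
in a Bourne shell wrapper. It attaches /dev/null to standard input only
when standard input is closed. The function names (stdin_is_open,
run_with_safe_stdin) and the mpirun arguments in the comments are
hypothetical, chosen for illustration; this is not anything Open MPI
provides. On Linux, errno 9 is EBADF ("Bad file descriptor"), which is
consistent with a closed descriptor 0.

```shell
#!/bin/sh

# Return success if file descriptor 0 (standard input) is open.
stdin_is_open() {
    # Duplicating fd 0 onto fd 3 fails when fd 0 is closed.
    # The subshell keeps the temporary fd 3 and the failure local.
    ( exec 3<&0 ) 2>/dev/null
}

# Run a command, attaching /dev/null to stdin only if stdin is closed.
# (Hypothetical helper; an already-open stdin is passed through untouched.)
run_with_safe_stdin() {
    if stdin_is_open; then
        "$@"
    else
        "$@" </dev/null
    fi
}

# Hypothetical usage in a wrapper script:
#   run_with_safe_stdin mpirun -np 4 ./my_app
#
# The failing case can be reproduced by closing stdin explicitly:
#   mpirun -np 4 ./my_app <&-
```

If the job never reads standard input, the simpler form
"mpirun ... </dev/null" has the same effect without the check.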

Regards,

Patrick