Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Strange behaviour of SGE+OpenMPI
From: Dave Love (d.love_at_[hidden])
Date: 2009-04-01 12:54:27


Rolf Vandevaart <Rolf.Vandevaart_at_[hidden]> writes:

> No, orte_leave_session_attached is needed to avoid the errno=2 errors
> from the sm btl. (It is fixed in 1.3.2 and trunk)

[It does cause other trouble, but I forget what the exact behaviour was
when I lost it as a default.]

>> Yes, but there's a problem with the recommended (as far as I remember)
>> setup, with one slot per node to ensure a single job per node. In that
>> case, you have no control over allocation -- -bynode and -byslot are
>> equivalent, which apparently can badly affect some codes. We're
>> currently using a starter to generate a hosts file for that reason
                     ^^^^^^^

I meant a queue prologue, not a PE starter method.

>> (complicated by having dual- and quad-core nodes) and would welcome a
>> better idea.
>>
> I am not sure what you are asking here. Are you trying to get a
> single MPI process per node? You could use -npernode 1. Sorry for my
> confusion.

No. It's an SGE issue, not an Open MPI one, but to try to explain
anyhow: People normally want to ensure that a partially-full node
running an MPI job doesn't get anything else scheduled on it. E.g. on
8-core nodes, if you submit a 16-process job, there are four cores left
over on the relevant nodes which might get something else scheduled on
them. Using one slot per node avoids that, but means generating your
own hosts file if you want -bynode and -byslot not to be equivalent.
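To make the hosts-file step concrete, here is a minimal sketch of what such a generator might look like. It is not the prologue we actually run. It assumes the SGE PE hostfile has its usual layout (host name first, granted slot count second) and that Open MPI accepts hostfile lines of the form "host slots=N". The function name and the cores_per_host table are hypothetical; a real site would have to get the per-node core counts from somewhere of its own.

```python
def openmpi_hostfile(pe_hostfile_text, cores_per_host):
    """Build Open MPI hostfile text from SGE PE hostfile text.

    pe_hostfile_text: contents of the file SGE names in $PE_HOSTFILE,
        assumed to hold one "host nslots queue processor-range" line per host.
    cores_per_host: hypothetical site-supplied mapping of host name to the
        real core count (needed because SGE only reports one slot per node).
    """
    out = []
    for raw in pe_hostfile_text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        host = fields[0]
        granted = int(fields[1])
        # Fall back to the SGE-granted slot count if the host is not listed.
        slots = cores_per_host.get(host, granted)
        out.append(f"{host} slots={slots}\n")
    return "".join(out)


if __name__ == "__main__":
    # Made-up example: one dual-core and one quad-core node, one slot each.
    sample = (
        "node01 1 all.q@node01 UNDEFINED\n"
        "node02 1 all.q@node02 UNDEFINED\n"
    )
    cores = {"node01": 2, "node02": 4}
    print(openmpi_hostfile(sample, cores), end="")
```

With a file like that passed to mpirun via -hostfile, -byslot and -bynode should differ again, since Open MPI then sees more than one slot per node.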