
Open MPI User's Mailing List Archives



Subject: Re: [OMPI users] Strange behaviour of SGE+OpenMPI
From: Dave Love (d.love_at_[hidden])
Date: 2009-04-01 12:54:27


Rolf Vandevaart <Rolf.Vandevaart_at_[hidden]> writes:

> No, orte_leave_session_attached is needed to avoid the errno=2 errors
> from the sm btl. (It is fixed in 1.3.2 and trunk)

[It does cause other trouble, but I forget what the exact behaviour was
when I lost it as a default.]

>> Yes, but there's a problem with the recommended (as far as I remember)
>> setup, with one slot per node to ensure a single job per node. In that
>> case, you have no control over allocation -- -bynode and -byslot are
>> equivalent, which apparently can badly affect some codes. We're
>> currently using a starter to generate a hosts file for that reason
                     ^^^^^^^

I meant queue prologue, not pe starter method.

>> (complicated by having dual- and quad-core nodes) and would welcome a
>> better idea.
>>
> I am not sure what you are asking here. Are you trying to get a
> single MPI process per node? You could use -npernode 1. Sorry for my
> confusion.

No. It's an SGE issue, not an Open MPI one, but to try to explain
anyhow: People normally want to ensure that a partially-full node
running an MPI job doesn't get anything else scheduled on it. E.g. on
8-core nodes, if you submit a 12-process job, there are four cores left
over on the relevant nodes which might get something else scheduled on
them. Using one slot per node avoids that, but means generating your
own hosts file if you want -bynode and -byslot not to be equivalent.
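
For illustration, here is a minimal sketch of what such a hosts-file generator might look like; it is not the script actually in use. It assumes the usual $PE_HOSTFILE layout (host name, slot count, queue, processor range per line) and a hypothetical `cores_per_host` table giving the real core count of each node, which you would have to fill in from your own site data. The function name and the sample host names are made up.

```python
def openmpi_hostfile_lines(pe_hostfile_text, cores_per_host, default_cores=1):
    """Turn the contents of SGE's $PE_HOSTFILE into Open MPI hostfile lines.

    cores_per_host is a hypothetical mapping of host name to real core
    count; hosts missing from it fall back to default_cores.
    """
    lines = []
    seen = set()
    for raw in pe_hostfile_text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        host = fields[0]
        if host in seen:
            continue  # a host can be listed once per queue; emit it once
        seen.add(host)
        # SGE's slot count (fields[1]) is always 1 in the one-slot-per-node
        # setup, so it is ignored in favour of the real core count.
        cores = cores_per_host.get(host, default_cores)
        lines.append(f"{host} slots={cores}")
    return lines


if __name__ == "__main__":
    # Made-up allocation: one dual-core and one quad-core node.
    sample = (
        "node01 1 all.q@node01 UNDEFINED\n"
        "node02 1 all.q@node02 UNDEFINED\n"
    )
    for line in openmpi_hostfile_lines(sample, {"node01": 2, "node02": 4}):
        print(line)
```

With the result written to a file and passed to mpirun via -hostfile, -byslot and -bynode should once again place ranks differently, since each node then advertises more than one slot to Open MPI while SGE still sees one slot per node.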