Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Strange behaviour of SGE+OpenMPI
From: Rolf Vandevaart (Rolf.Vandevaart_at_[hidden])
Date: 2009-03-31 17:03:51


On 03/31/09 14:50, Dave Love wrote:
> Rolf Vandevaart <Rolf.Vandevaart_at_[hidden]> writes:
>
>>> However, I found that if I explicitly specify the "-machinefile
>>> $TMPDIR/machines", all 8 mpi processes were spawned within a single
>>> node, i.e. node0002.
>
> I had that sort of behaviour recently when the tight integration was
> broken on the installation we'd been given, and it took me a long time
> to spot. [Is the orte_leave_session_attached fix relevant here?]
No, orte_leave_session_attached is needed to avoid the errno=2 errors
from the sm btl. (That problem is fixed in 1.3.2 and on the trunk.)
>
>> And for what it is worth, as you have seen,
>> you do not need to specify a machines file. Open MPI will use the
>> ones that were allocated by SGE.
>
> Yes, but there's a problem with the recommended (as far as I remember)
> setup, with one slot per node to ensure a single job per node. In that
> case, you have no control over allocation -- -bynode and -byslot are
> equivalent, which apparently can badly affect some codes. We're
> currently using a starter to generate a hosts file for that reason
> (complicated by having dual- and quad-core nodes) and would welcome a
> better idea.
>
I am not sure what you are asking here. Are you trying to get a single
MPI process per node? You could use -npernode 1. Sorry for my confusion.
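
For what it is worth, the point about -bynode and -byslot being
equivalent with one slot per node can be illustrated with a toy model.
This is only a sketch of the two placement orders, not Open MPI's actual
mapper; the function names and the (hostname, slots) list are made up
for illustration, and it assumes unique hostnames and no oversubscription.

```python
def map_byslot(nodes, nprocs):
    """Fill every slot on a node before moving on to the next node."""
    placement = []
    for name, slots in nodes:
        for _ in range(slots):
            if len(placement) == nprocs:
                return placement
            placement.append(name)
    return placement


def map_bynode(nodes, nprocs):
    """Place one rank per node in turn, skipping nodes with no free slot."""
    remaining = {name: slots for name, slots in nodes}
    placement = []
    while len(placement) < nprocs and any(remaining.values()):
        for name, _ in nodes:
            if len(placement) == nprocs:
                break
            if remaining[name] > 0:
                placement.append(name)
                remaining[name] -= 1
    return placement


# Hypothetical allocations: two slots per node versus one slot per node.
two_slots = [("node0001", 2), ("node0002", 2)]
one_slot = [("node0001", 1), ("node0002", 1)]

print(map_byslot(two_slots, 4))  # ranks grouped node by node
print(map_bynode(two_slots, 4))  # ranks alternate between nodes
print(map_byslot(one_slot, 2) == map_bynode(one_slot, 2))
```

With two slots per node the two orders differ; with one slot per node
there is only one possible order, so the choice of option has no effect
on placement.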

Rolf

-- 
=========================
rolf.vandevaart_at_[hidden]
781-442-3043
=========================