Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problems with gridengine integration on RHEL 6
From: Reuti (reuti_at_[hidden])
Date: 2012-02-15 17:20:46


On 15.02.2012 at 22:59, Brian McNally wrote:

> Thanks for responding so quickly, Reuti!
>
> To be clear, my RHEL 5 and RHEL 6 nodes are part of the same cluster. In the RHEL 5 case, qrsh -inherit gets called via mpirun. In the RHEL 6 case, /usr/bin/ssh gets called directly from mpirun. The cluster setup looks like:
>
> qlogin_command /usr/local/bin/qlogin_command
> qlogin_daemon /usr/sbin/sshd -i
> rlogin_command builtin
> rsh_command builtin
> rsh_daemon builtin
>
> I don't seem to have a "rlogin_daemon" set.

Then it will use the former rlogin method. If you don't use rlogin, this doesn't matter; it would not even work, as the rlogin_command setting doesn't match it.

(Just for reference: http://arc.liv.ac.uk/SGE/htmlman/htmlman5/remote_startup.html)
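To make it easier to see which entries deviate, here is a minimal sketch in Python that parses the remote startup lines quoted above and reports which entries are missing or not set to "builtin". The helper names (`parse_conf`, `summarize`) are made up for illustration, and the input is simply the text pasted from this thread, not live `qconf` output. As far as I read the man page linked above, `qrsh` with a command (which includes `qrsh -inherit`) uses the rsh_command/rsh_daemon pair, so those two are the ones to watch here.

```python
# Sketch only: parse remote-startup entries pasted from `qconf -sconf`
# and report which ones are missing or not set to "builtin".

# The lines quoted earlier in this thread.
SAMPLE = """\
qlogin_command /usr/local/bin/qlogin_command
qlogin_daemon /usr/sbin/sshd -i
rlogin_command builtin
rsh_command builtin
rsh_daemon builtin
"""

# The six remote-startup entries discussed in this thread.
EXPECTED_KEYS = [
    "qlogin_command", "qlogin_daemon",
    "rlogin_command", "rlogin_daemon",
    "rsh_command", "rsh_daemon",
]


def parse_conf(text):
    """Split each 'key value...' line into a dict entry."""
    conf = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # key, then the rest of the line
        if len(parts) == 2:
            conf[parts[0]] = parts[1].strip()
    return conf


def summarize(conf):
    """Return (missing keys, entries that are not 'builtin')."""
    missing = [k for k in EXPECTED_KEYS if k not in conf]
    non_builtin = {
        k: v for k, v in conf.items()
        if k in EXPECTED_KEYS and v != "builtin"
    }
    return missing, non_builtin


conf = parse_conf(SAMPLE)
missing, non_builtin = summarize(conf)
print("missing:", missing)
print("not builtin:", non_builtin)
```

For the configuration quoted above, this flags rlogin_daemon as missing and the two qlogin_* entries as non-builtin, while rsh_command and rsh_daemon are both "builtin".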

Are there any local (host-specific) configurations? You can list the hosts that have one with:

$ qconf -sconfl
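The reason for asking: a host-local configuration takes precedence over the global one for the entries it sets, so a local entry on the RHEL 6 nodes could change the startup method even though the global configuration says "builtin". Below is a minimal sketch of that overlay in Python. The local value shown is purely hypothetical (it is not taken from your cluster), and `effective_conf` is a made-up helper name.

```python
# Sketch only: overlay a host-local configuration on the global one
# to see which remote-startup entries effectively change on that host.

def effective_conf(global_conf, local_conf):
    """Local entries override global ones; everything else is inherited."""
    merged = dict(global_conf)
    merged.update(local_conf)
    return merged


# Global values as quoted in this thread.
global_conf = {"rsh_command": "builtin", "rsh_daemon": "builtin"}

# HYPOTHETICAL local override for one host, for illustration only.
# On a real cluster you would paste the output of `qconf -sconf <host>`.
local_conf = {"rsh_command": "/usr/bin/ssh"}

effective = effective_conf(global_conf, local_conf)

# Entries whose effective value differs from the global one,
# stored as (global value, effective value).
changed = {
    key: (global_conf.get(key), value)
    for key, value in effective.items()
    if global_conf.get(key) != value
}
print(changed)
```

If such a comparison comes back empty for the RHEL 6 hosts, the local configurations are not the cause and the difference has to come from somewhere else.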

-- Reuti

> --
> Brian McNally
>
> On 02/15/2012 01:43 PM, Reuti wrote:
>> Hi,
>>
>> On 15.02.2012 at 22:21, Brian McNally wrote:
>>
>>> Hello Open MPI community,
>>>
>>> I'm running the openmpi 1.5.3 package as provided by Red Hat Enterprise Linux 6, along with SGE 6.2u3. I've discovered that under RHEL 5 orted gets spawned via qrsh, while under RHEL 6 orted gets spawned via SSH. This is happening in the same cluster environment with the same parallel environment setup. I want orted to get spawned via qrsh because we impose memory limits if a job is spawned through SSH.
>>
>> Is it spawned by SSH directly or as a result of `qrsh -inherit ...`, while:
>>
>> $ qconf -sconf
>> ...
>> qlogin_command builtin
>> qlogin_daemon builtin
>> rlogin_command builtin
>> rlogin_daemon builtin
>> rsh_command builtin
>> rsh_daemon builtin
>>
>> is set in the old cluster but different in the new one (i.e. pointing to SSH there)?
>>
>> -- Reuti
>>
>>
>>> I cannot determine WHY the behavior is different from RHEL 5 to RHEL 6. In the former I'm using the openmpi 1.4.3 package, in the latter I'm using openmpi 1.5.3. Both are supposedly built to support the gridengine ras.
>>>
>>> MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.4.3)
>>> MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.5.3)
>>>
>>> Any thoughts? The documentation indicates that "Open MPI will automatically detect when it is running inside SGE and will just 'do the Right Thing.'" That isn't what happens in my case!
>>>
>>> --
>>> Brian McNally
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users