Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Fwd: [GE users] Open MPI job fails when run thru SGE
From: Reuti (reuti_at_[hidden])
Date: 2009-02-02 06:12:14


On 02.02.2009, at 11:31, Sangamesh B wrote:

> On Mon, Feb 2, 2009 at 12:15 PM, Reuti <reuti_at_[hidden]>
> wrote:
>> On 02.02.2009, at 05:44, Sangamesh B wrote:
>>
>>> On Sun, Feb 1, 2009 at 10:37 PM, Reuti <reuti_at_staff.uni-
>>> marburg.de> wrote:
>>>>
>>>>> On 01.02.2009, at 16:00, Sangamesh B wrote:
>>>>
>>>>> On Sat, Jan 31, 2009 at 6:27 PM, Reuti <reuti_at_staff.uni-
>>>>> marburg.de>
>>>>> wrote:
>>>>>>
>>>>>> On 31.01.2009, at 08:49, Sangamesh B wrote:
>>>>>>
>>>>>>> On Fri, Jan 30, 2009 at 10:20 PM, Reuti <reuti_at_staff.uni-
>>>>>>> marburg.de>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On 30.01.2009, at 15:02, Sangamesh B wrote:
>>>>>>>>
>>>>>>>>> Dear Open MPI,
>>>>>>>>>
>>>>>>>>> Do you have a solution for the following problem with Open
>>>>>>>>> MPI (1.3) when it is run through Grid Engine?
>>>>>>>>>
>>>>>>>>> I changed the global execd params with H_MEMORYLOCKED=infinity
>>>>>>>>> and restarted sgeexecd on all nodes.
>>>>>>>>>
>>>>>>>>> But still the problem persists:
>>>>>>>>>
>>>>>>>>> $cat err.77.CPMD-OMPI
>>>>>>>>> ssh_exchange_identification: Connection closed by remote host
>>>>>>>>
>>>>>>>> I think this might already be the reason why it's not working.
>>>>>>>> Is an mpihello program running fine through SGE?
>>>>>>>>
>>>>>>> No.
>>>>>>>
>>>>>>> Any Open MPI parallel job through SGE runs only if it's running on
>>>>>>> a single node (i.e. 8 processes on 8 cores of a single node). If the
>>>>>>> number of processes is more than 8, then SGE will schedule it on 2
>>>>>>> nodes - and the job will fail with the above error.
>>>>>>>
>>>>>>> Now I did a loose integration of Open MPI 1.3 with SGE. The
>>>>>>> job runs,
>>>>>>> but all 16 processes run on a single node.
>>>>>>
>>>>>> What are the entries in `qconf -sconf` for:
>>>>>>
>>>>>> rsh_command
>>>>>> rsh_daemon
>>>>>>
>>>>> $ qconf -sconf
>>>>> global:
>>>>> execd_spool_dir /opt/gridengine/default/spool
>>>>> ...
>>>>> .....
>>>>> qrsh_command /usr/bin/ssh
>>>>> rsh_command /usr/bin/ssh
>>>>> rlogin_command /usr/bin/ssh
>>>>> rsh_daemon /usr/sbin/sshd
>>>>> qrsh_daemon /usr/sbin/sshd
>>>>> reprioritize 0
>>>>
>>>> Do you have to use ssh? Often in a private cluster the rsh-based
>>>> one is fine, or with SGE 6.2 the built-in mechanism of SGE.
>>>> Otherwise please follow this:
>>>>
>>>> http://gridengine.sunsource.net/howto/qrsh_qlogin_ssh.html
>>>>
>>>>
>>>>> I think it's better to check once with Open MPI 1.2.8.
>>>>>
>>>>>> What is your mpirun command in the jobscript - are you getting
>>>>>> the mpirun from Open MPI there? According to the output below,
>>>>>> it's not a loose integration, but you already prepare a
>>>>>> machinefile, which is superfluous for Open MPI.
>>>>>>
>>>>> No. I've not prepared the machinefile for Open MPI.
>>>>> For the tight integration job:
>>>>>
>>>>> /opt/mpi/openmpi/1.3/intel/bin/mpirun -np $NSLOTS
>>>>> $CPMDBIN/cpmd311-ompi-mkl.x wf1.in $PP_LIBRARY >
>>>>> wf1.out_OMPI$NSLOTS.$JOB_ID
>>>>>
>>>>> For the loose integration job:
>>>>>
>>>>> /opt/mpi/openmpi/1.3/intel/bin/mpirun -np $NSLOTS -hostfile
>>>>> $TMPDIR/machines $CPMDBIN/cpmd311-ompi-mkl.x wf1.in
>>>>> $PP_LIBRARY >
>>>>> wf1.out_OMPI_$JOB_ID.$NSLOTS
>>>>
>>>> a) you compiled Open MPI with "--with-sge"?
>>>>
>>> Yes. But ompi_info shows only one SGE component:
>>>
>>> $ /opt/mpi/openmpi/1.3/intel/bin/ompi_info | grep gridengine
>>> MCA ras: gridengine (MCA v2.0, API v2.0,
>>> Component v1.3)
>>>
>>>> b) when the $SGE_ROOT variable is set, Open MPI will use a Tight
>>>> Integration
>>>> automatically.
>>>>
>>> In the SGE job submission script, I set SGE_ROOT= <nothing>
>>
>> This will set the variable to an empty string. You need to use:
>>
>> unset SGE_ROOT
>>
> Right.
> I used 'unset SGE_ROOT' in the job submission script. It's working now.
> Hello world jobs are working now (single & multiple nodes).
>
> Thank you for the help.
>
> What can be the problem with tight integration?

There are apparently two issues for now with the Tight Integration for
SGE:

- Some processes might throw an "err=2" for an unknown reason, and only
from time to time, but they run fine.

- Processes vanish into a daemon although SGE's qrsh is used
automatically (successive `ps -e f` runs show that it's called with
"... orted --daemonize ..." for a short while) - I overlooked this in
my last post when I stated it's working, as my process allocation was
fine. Only the processes weren't bound to any sge_shepherd (see the
check sketched below).
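
As a rough way to see this on a compute node while the job is running
(only a sketch based on the observation above, not an exact
transcript), one might repeat something like:

  # With a working tight integration, orted should stay a child of the
  # sge_shepherd/qrsh chain instead of being reparented after it
  # daemonizes.
  ps -e f | egrep 'sge_shepherd|qrsh|orted|cpmd'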

It seems the SGE integration is broken, and it would indeed be better
to stay with 1.2.8 for now :-/
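
For the record, the loose-integration workaround from above boils down
to a job script roughly like the following. This is only a sketch: the
parallel environment name "orte" and the slot count are placeholders,
and $CPMDBIN / $PP_LIBRARY are assumed to be set as in the original
script; the paths and file names are the ones quoted in this thread.

  #!/bin/sh
  # The PE name "orte" and the slot count are placeholders - use your own PE.
  #$ -pe orte 16
  #$ -cwd
  # Hide SGE from Open MPI 1.3 so the (currently broken) tight
  # integration is not activated.
  unset SGE_ROOT
  # Use the machinefile that the PE provides in $TMPDIR (as in the job
  # script quoted above).
  /opt/mpi/openmpi/1.3/intel/bin/mpirun -np $NSLOTS -hostfile $TMPDIR/machines \
      $CPMDBIN/cpmd311-ompi-mkl.x wf1.in $PP_LIBRARY > \
      wf1.out_OMPI_$JOB_ID.$NSLOTS

Once the tight integration works again, dropping the `unset SGE_ROOT`
and the -hostfile option should be enough to switch back.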

-- Reuti