Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Fwd: [GE users] Open MPI job fails when run thru SGE
From: Sangamesh B (forum.san_at_[hidden])
Date: 2009-02-01 10:00:30


On Sat, Jan 31, 2009 at 6:27 PM, Reuti <reuti_at_[hidden]> wrote:
> Am 31.01.2009 um 08:49 schrieb Sangamesh B:
>
>> On Fri, Jan 30, 2009 at 10:20 PM, Reuti <reuti_at_[hidden]>
>> wrote:
>>>
>>> Am 30.01.2009 um 15:02 schrieb Sangamesh B:
>>>
>>>> Dear Open MPI,
>>>>
>>>> Do you have a solution for the following problem of Open MPI (1.3)
>>>> when run through Grid Engine.
>>>>
>>>> I changed global execd params with H_MEMORYLOCKED=infinity and
>>>> restarted the sgeexecd in all nodes.
>>>>
>>>> But still the problem persists:
>>>>
>>>> $cat err.77.CPMD-OMPI
>>>> ssh_exchange_identification: Connection closed by remote host
>>>
>>> I think this might already be the reason why it's not working. An mpihello
>>> program is running fine through SGE?
>>>
>> No.
>>
>> Any Open MPI parallel job through SGE runs only if it is running on a
>> single node (i.e. 8 processes on 8 cores of a single node). If the number
>> of processes is more than 8, then SGE will schedule it on 2 nodes and
>> the job will fail with the above error.
>>
>> Now I did a loose integration of Open MPI 1.3 with SGE. The job runs,
>> but all 16 processes run on a single node.
>
> What are the entries in `qconf -sconf` for:
>
> rsh_command
> rsh_daemon
>
$ qconf -sconf
global:
execd_spool_dir /opt/gridengine/default/spool
...
.....
qrsh_command /usr/bin/ssh
rsh_command /usr/bin/ssh
rlogin_command /usr/bin/ssh
rsh_daemon /usr/sbin/sshd
qrsh_daemon /usr/sbin/sshd
reprioritize 0
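As an aside, with rsh_command/rsh_daemon pointed at ssh/sshd like this, a tight integration launches its daemons through `qrsh -inherit`, which starts a per-job sshd on each granted node. One way to test that machinery by itself, independent of Open MPI (the PE name "orte" is taken from this thread; the output file name is illustrative):

```shell
# Illustrative sketch: from inside a parallel job, run qrsh -inherit to
# the last granted host. If this also fails with
# "ssh_exchange_identification", the rsh_command/rsh_daemon setup is
# the culprit rather than Open MPI itself.
echo 'qrsh -inherit $(tail -1 "$PE_HOSTFILE" | cut -d" " -f1) hostname' \
  | qsub -pe orte 16 -cwd -j y -o qrsh_test.out
```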

I think it's better to check once with Open MPI 1.2.8.

> What is your mpirun command in the jobscript - are you getting the
> mpirun from Open MPI there? According to the output below, it's not a loose
> integration, but you already prepare a machinefile, which is superfluous for
> Open MPI.
>
No. I've not prepared the machinefile for Open MPI.
For the tight integration job:

/opt/mpi/openmpi/1.3/intel/bin/mpirun -np $NSLOTS
$CPMDBIN/cpmd311-ompi-mkl.x wf1.in $PP_LIBRARY >
wf1.out_OMPI$NSLOTS.$JOB_ID

For the loose integration job:

/opt/mpi/openmpi/1.3/intel/bin/mpirun -np $NSLOTS -hostfile
$TMPDIR/machines $CPMDBIN/cpmd311-ompi-mkl.x wf1.in $PP_LIBRARY >
wf1.out_OMPI_$JOB_ID.$NSLOTS
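For context, the $TMPDIR/machines file in a loose integration is usually generated by a PE start script from SGE's $PE_HOSTFILE (lines of "host slots queue processor-range"). A minimal sketch of that expansion; the file names and sample contents below are illustrative stand-ins so it can be tried outside the scheduler:

```shell
#!/bin/sh
# Sketch: expand an SGE pe_hostfile into the one-hostname-per-slot
# machinefile that `mpirun -hostfile` expects.

# Stand-in for $PE_HOSTFILE when testing outside SGE:
cat > pe_hostfile.sample <<'EOF'
ibc17 8 all.q@node-0-17 UNDEFINED
ibc12 8 all.q@node-0-12 UNDEFINED
EOF

: > machines.sample
while read -r host slots rest; do
  i=0
  while [ "$i" -lt "$slots" ]; do
    echo "$host" >> machines.sample   # one line per slot on this host
    i=$((i + 1))
  done
done < pe_hostfile.sample
```

Under SGE the input would be "$PE_HOSTFILE" and the output "$TMPDIR/machines".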

I think I should check with Open MPI 1.2.8. That may work.

Thanks,
Sangamesh
>> $ cat out.83.Hello-OMPI
>> /opt/gridengine/default/spool/node-0-17/active_jobs/83.1/pe_hostfile
>> ibc17
>> ibc17
>> ibc17
>> ibc17
>> ibc17
>> ibc17
>> ibc17
>> ibc17
>> ibc12
>> ibc12
>> ibc12
>> ibc12
>> ibc12
>> ibc12
>> ibc12
>> ibc12
>> Greetings: 1 of 16 from the node node-0-17.local
>> Greetings: 10 of 16 from the node node-0-17.local
>> Greetings: 15 of 16 from the node node-0-17.local
>> Greetings: 9 of 16 from the node node-0-17.local
>> Greetings: 14 of 16 from the node node-0-17.local
>> Greetings: 8 of 16 from the node node-0-17.local
>> Greetings: 11 of 16 from the node node-0-17.local
>> Greetings: 12 of 16 from the node node-0-17.local
>> Greetings: 6 of 16 from the node node-0-17.local
>> Greetings: 0 of 16 from the node node-0-17.local
>> Greetings: 5 of 16 from the node node-0-17.local
>> Greetings: 3 of 16 from the node node-0-17.local
>> Greetings: 13 of 16 from the node node-0-17.local
>> Greetings: 4 of 16 from the node node-0-17.local
>> Greetings: 7 of 16 from the node node-0-17.local
>> Greetings: 2 of 16 from the node node-0-17.local
>>
>> But qhost -u <user name> shows that it is scheduled/running on two nodes.
>>
>> Anybody successful in running Open MPI 1.3 tightly integrated with SGE?
>
> For a Tight Integration there's a FAQ:
>
> http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge
>
> -- Reuti
>
>>
>> Thanks,
>> Sangamesh
>>
>>> -- Reuti
>>>
>>>
>>>>
>>>> --------------------------------------------------------------------------
>>>> A daemon (pid 31947) died unexpectedly with status 129 while attempting
>>>> to launch so we are aborting.
>>>>
>>>> There may be more information reported by the environment (see above).
>>>>
>>>> This may be because the daemon was unable to find all the needed shared
>>>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have
>>>> the
>>>> location of the shared libraries on the remote nodes and this will
>>>> automatically be forwarded to the remote nodes.
>>>>
>>>> --------------------------------------------------------------------------
>>>>
>>>> --------------------------------------------------------------------------
>>>> mpirun noticed that the job aborted, but has no info as to the process
>>>> that caused that situation.
>>>>
>>>> --------------------------------------------------------------------------
>>>> ssh_exchange_identification: Connection closed by remote host
>>>>
>>>> --------------------------------------------------------------------------
>>>> mpirun was unable to cleanly terminate the daemons on the nodes shown
>>>> below. Additional manual cleanup may be required - please refer to
>>>> the "orte-clean" tool for assistance.
>>>>
>>>> --------------------------------------------------------------------------
>>>> node-0-19.local - daemon did not report back when launched
>>>> node-0-20.local - daemon did not report back when launched
>>>> node-0-21.local - daemon did not report back when launched
>>>> node-0-22.local - daemon did not report back when launched
>>>>
>>>> The hostnames for the InfiniBand interfaces are ibc0, ibc1, ibc2 .. ibc23.
>>>> Maybe Open MPI is not able to identify the hosts, as it is using the
>>>> node-0-.. names. Is this causing Open MPI to fail?
>>>>
>>>> Thanks,
>>>> Sangamesh
>>>>
>>>>
>>>> On Mon, Jan 26, 2009 at 5:09 PM, mihlon <vaclam1_at_[hidden]> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>>> Hello SGE users,
>>>>>>
>>>>>> The cluster is installed with Rocks-4.3, SGE 6.0 & Open MPI 1.3.
>>>>>> Open MPI is configured with "--with-sge".
>>>>>> ompi_info shows only one component:
>>>>>> # /opt/mpi/openmpi/1.3/intel/bin/ompi_info | grep gridengine
>>>>>> MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.3)
>>>>>>
>>>>>> Is this acceptable?
>>>>>
>>>>> maybe yes
>>>>>
>>>>> see: http://www.open-mpi.org/faq/?category=building#build-rte-sge
>>>>>
>>>>> shell$ ompi_info | grep gridengine
>>>>> MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.3)
>>>>> MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.3)
>>>>>
>>>>> (Specific frameworks and version numbers may vary, depending on your
>>>>> version of Open MPI.)
>>>>>
>>>>>> The Open MPI parallel jobs run successfully from the command line, but
>>>>>> fail when run through SGE (with -pe orte <slots>).
>>>>>>
>>>>>> The error is:
>>>>>>
>>>>>> $ cat err.26.Helloworld-PRL
>>>>>> ssh_exchange_identification: Connection closed by remote host
>>>>>>
>>>>>>
>>>>>> --------------------------------------------------------------------------
>>>>>> A daemon (pid 8462) died unexpectedly with status 129 while attempting
>>>>>> to launch so we are aborting.
>>>>>>
>>>>>> There may be more information reported by the environment (see above).
>>>>>>
>>>>>> This may be because the daemon was unable to find all the needed
>>>>>> shared
>>>>>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have
>>>>>> the
>>>>>> location of the shared libraries on the remote nodes and this will
>>>>>> automatically be forwarded to the remote nodes.
>>>>>>
>>>>>>
>>>>>> --------------------------------------------------------------------------
>>>>>>
>>>>>>
>>>>>> --------------------------------------------------------------------------
>>>>>> mpirun noticed that the job aborted, but has no info as to the process
>>>>>> that caused that situation.
>>>>>>
>>>>>>
>>>>>> --------------------------------------------------------------------------
>>>>>> mpirun: clean termination accomplished
>>>>>>
>>>>>> But the same job runs well if it runs on a single node, though with an
>>>>>> error:
>>>>>>
>>>>>> $ cat err.23.Helloworld-PRL
>>>>>> libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
>>>>>> This will severely limit memory registrations.
>>>>>> libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
>>>>>> This will severely limit memory registrations.
>>>>>> libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
>>>>>> This will severely limit memory registrations.
>>>>>> libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
>>>>>> This will severely limit memory registrations.
>>>>>> libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
>>>>>> This will severely limit memory registrations.
>>>>>>
>>>>>>
>>>>>> --------------------------------------------------------------------------
>>>>>> WARNING: There was an error initializing an OpenFabrics device.
>>>>>>
>>>>>> Local host: node-0-4.local
>>>>>> Local device: mthca0
>>>>>>
>>>>>>
>>>>>> --------------------------------------------------------------------------
>>>>>> libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
>>>>>> This will severely limit memory registrations.
>>>>>> libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
>>>>>> This will severely limit memory registrations.
>>>>>> libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
>>>>>> This will severely limit memory registrations.
>>>>>> [node-0-4.local:07869] 7 more processes have sent help message
>>>>>> help-mpi-btl-openib.txt / error in device init
>>>>>> [node-0-4.local:07869] Set MCA parameter "orte_base_help_aggregate" to
>>>>>> 0 to see all help / error messages
>>>>>>
>>>>>> The following link explains the same problem:
>>>>>>
>>>>>>
>>>>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=72398
>>>>>>
>>>>>> With this reference, I put 'ulimit -l unlimited' into
>>>>>> /etc/init.d/sgeexecd on all nodes and restarted the services.
>>>>>
>>>>> Do not set 'ulimit -l unlimited' in /etc/init.d/sgeexecd,
>>>>> but set it in SGE itself:
>>>>>
>>>>> Run qconf -mconf and set execd_params
>>>>>
>>>>>
>>>>> frontend$> qconf -sconf
>>>>> ...
>>>>> execd_params H_MEMORYLOCKED=infinity
>>>>> ...
>>>>>
>>>>>
>>>>> Then restart all your sgeexecd hosts.
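To confirm the new limit actually reaches SGE-launched processes (and not just interactive logins), a quick check along these lines may help; the output file name is illustrative:

```shell
# Illustrative: run `ulimit -l` inside a batch job. After setting
# execd_params H_MEMORYLOCKED=infinity and restarting the execds,
# the output file should now show "unlimited".
echo 'ulimit -l' | qsub -cwd -j y -o memlock_check.out
```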
>>>>>
>>>>>
>>>>> Milan
>>>>>
>>>>>> But still the problem persists.
>>>>>>
>>>>>> What could be the way out for this?
>>>>>>
>>>>>> Thanks,
>>>>>> Sangamesh
>>>>>>
>>>>>> ------------------------------------------------------
>>>>>>
>>>>>>
>>>>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=99133
>>>>>>
>>>>>> To unsubscribe from this discussion, e-mail:
>>>>>> [users-unsubscribe_at_[hidden]].
>>>>>>
>>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>