Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Trouble with SGE integration
From: Ondrej Glembek (glembek_at_[hidden])
Date: 2009-11-30 14:07:26


I definitely compiled the package with the --with-sge flag... Here's my
configure line:

./configure --prefix=/homes/kazi/glembek/share/openmpi-1.3.3-64
--with-sge --enable-shared --enable-static --host=x86_64-linux
--build=x86_64-linux NM=x86_64-linux-nm
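
One way to double-check that the installed build really contains SGE support is to look for gridengine components in the output of ompi_info. A minimal sketch, assuming the ompi_info from this installation is the first one in PATH; the helper name has_gridengine is made up for illustration:

```shell
#!/bin/sh
# has_gridengine: hypothetical helper; reads `ompi_info` output on stdin
# and succeeds if any line mentions a gridengine component.
has_gridengine() {
    grep -q 'gridengine'
}

# Only run the check when ompi_info is actually installed.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_gridengine; then
        echo "gridengine components found"
    else
        echo "no gridengine components listed"
    fi
fi
```

If nothing is listed, the binaries being picked up on the nodes are probably not the ones built with --with-sge.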

Just to mention one more interesting thing: when, by luck, SGE places
all the job's slots on the same machine (i.e. the SMP case), everything
works with no problem...

Is there any way to force the ssh to be put in front of the (...) term?
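
A possible workaround on the starter side, sketched under the assumption that the sub-shell line reaches the starter as shell text whose first character is "(": hand such a line to /bin/sh -c instead of exec'ing it directly. The function name run_job is made up for illustration, and joining the arguments with "$*" drops any quoting between them, so this has not been verified against a real qrsh -inherit call:

```shell
#!/bin/sh
# run_job: hypothetical replacement for the bare `exec "$@"` in a
# starter method.
run_job() {
    case "$1" in
        '('*)
            # A sub-shell expression is shell syntax, not a program
            # name, so let a shell interpret the joined command line.
            exec /bin/sh -c "$*"
            ;;
        *)
            # Ordinary case: job script or command plus its options.
            exec "$@"
            ;;
    esac
}

# Last step of the starter: run whatever SGE handed over, if anything.
if [ "$#" -gt 0 ]; then
    run_job "$@"
fi
```

The plain branch keeps the current behaviour for the job script itself, so only the "( ... )" form is treated differently.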

Thanks
Ondrej

Reuti wrote:
> On 30.11.2009 at 18:46, Ondrej Glembek wrote:
>
>> Hi, thanks for the reply...
>>
>> I tried to dump the $@ before calling the exec and here it is:
>>
>>
>> ( test ! -r ./.profile || . ./.profile;
>> PATH=/homes/kazi/glembek/share/openmpi-1.3.3-64/bin:$PATH ; export
>> PATH ;
>> LD_LIBRARY_PATH=/homes/kazi/glembek/share/openmpi-1.3.3-64/lib:$LD_LIBRARY_PATH
>> ; export LD_LIBRARY_PATH ;
>> /homes/kazi/glembek/share/openmpi-1.3.3-64/bin/orted -mca ess env -mca
>> orte_ess_jobid 3870359552 -mca orte_ess_vpid 1 -mca orte_ess_num_procs
>> 2 --hnp-uri "3870359552.0;tcp://147.229.8.134:53727" --mca
>> pls_gridengine_verbose 1 --output-filename mpi.log )
>>
>>
>> It looks like the line gets constructed in
>> orte/mca/plm/rsh/plm_rsh_module.c and depends on the shell...
>>
>> Still I wonder, why mpiexec calls the starter.sh... I thought the
>> starter was supposed to call the script which wraps a call to mpiexec...
>
> Correct. This will happen for the master node of this job, i.e. where
> the jobscript is executed. But it will also be used for the qrsh
> -inherit calls. I wonder about one thing: I see only a call to "orted"
> and not the above sub-shell on my machines. Did you compile Open MPI
> with --with-sge?
>
> The original call above would be "ssh node_xy ( test ! ....)", which
> seems to work for ssh and rsh.
>
> Just one note: with the starter script you will lose the set PATH and
> LD_LIBRARY_PATH, as a new shell is created. It might be necessary to set
> it again in your starter method.
>
> -- Reuti
>
>
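
Re-exporting the two variables inside the starter method could look roughly like this. A sketch only: the default prefix is the one from the configure line in this thread, while the variable OMPI_PREFIX and the function restore_ompi_env are names invented here:

```shell
#!/bin/sh
# Hypothetical helper for a starter method: put the Open MPI install
# back on PATH and LD_LIBRARY_PATH, since the starter runs in a new shell.
OMPI_PREFIX=${OMPI_PREFIX:-/homes/kazi/glembek/share/openmpi-1.3.3-64}

restore_ompi_env() {
    # ${VAR:+:$VAR} appends the old value only if it was set and non-empty,
    # which avoids a stray trailing colon.
    PATH="$OMPI_PREFIX/bin${PATH:+:$PATH}"
    LD_LIBRARY_PATH="$OMPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export PATH LD_LIBRARY_PATH
}
```

Calling restore_ompi_env before the final exec in the starter would give orted the same search paths that the sub-shell line sets up.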
>>
>> Am I not right???
>> Ondrej
>>
>>
>> Reuti wrote:
>>> Hi,
>>> On 30.11.2009 at 16:33, Ondrej Glembek wrote:
>>>> we are using a custom starter method in our SGE to launch our
>>>> jobs... It looks something like this:
>>>>
>>>> #!/bin/sh
>>>>
>>>> # ... we do a whole bunch of stuff here
>>>>
>>>> # start the job in this shell
>>>> exec "$@"
>>> the "$@" should be replaced by the path to the jobscript (qsub) or
>>> command (qrsh) plus the given options.
>>> For the tasks spread to other nodes, I get as argument: " orted -mca
>>> ess env -mca orte_ess_jobid ...". There is also no . ./.profile.
>>> So I wonder where the . ./.profile is coming from. Can you put a
>>> `sleep 60` or the like before the `exec ...` and grep the built line
>>> from `ps -e f` before it crashes?
>>> -- Reuti
>>>> The trouble is that mpiexec passes a command which looks like this:
>>>>
>>>> ( . ./.profile ..... )
>>>>
>>>> which, however, is not a valid exec argument...
>>>>
>>>> Is there any way to tell mpiexec to run it in a separate script??? Any
>>>> idea how to solve this???
>>>>
>>>> Thanx
>>>> Ondrej Glembek
>>>>
>>>> --
>>>>
>>>> Ondrej Glembek, PhD student E-mail: glembek_at_[hidden]
>>>> UPGM FIT VUT Brno, L226 Web: http://www.fit.vutbr.cz/~glembek
>>>> Bozetechova 2, 612 66 Phone: +420 54114-1292
>>>> Brno, Czech Republic Fax: +420 54114-1290
>>>>
>>>> ICQ: 93233896
>>>> GPG: C050 A6DC 7291 6776 9B69 BB11 C033 D756 6F33 DE3C
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
   Ondrej Glembek, PhD student  E-mail: glembek_at_[hidden]
   UPGM FIT VUT Brno, L226      Web:    http://www.fit.vutbr.cz/~glembek
   Bozetechova 2, 612 66        Phone:  +420 54114-1292
   Brno, Czech Republic         Fax:    +420 54114-1290
   ICQ: 93233896
   GPG: C050 A6DC 7291 6776 9B69 BB11 C033 D756 6F33 DE3C