Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Trouble with SGE integration
From: Reuti (reuti_at_[hidden])
Date: 2009-11-30 13:03:58


On 30.11.2009 at 18:46, Ondrej Glembek wrote:

> Hi, thanks for the reply...
>
> I tried to dump the $@ before calling the exec and here it is:
>
>
> ( test ! -r ./.profile || . ./.profile;
> PATH=/homes/kazi/glembek/share/openmpi-1.3.3-64/bin:$PATH ; export PATH ;
> LD_LIBRARY_PATH=/homes/kazi/glembek/share/openmpi-1.3.3-64/lib:$LD_LIBRARY_PATH ;
> export LD_LIBRARY_PATH ;
> /homes/kazi/glembek/share/openmpi-1.3.3-64/bin/orted -mca ess env
> -mca orte_ess_jobid 3870359552 -mca orte_ess_vpid 1
> -mca orte_ess_num_procs 2 --hnp-uri "3870359552.0;tcp://147.229.8.134:53727"
> --mca pls_gridengine_verbose 1 --output-filename mpi.log )
>
>
> It looks like the line gets constructed in
> orte/mca/plm/rsh/plm_rsh_module.c and depends on the shell...
>
> Still, I wonder why mpiexec calls starter.sh... I thought the
> starter was supposed to call the script which wraps a call to
> mpiexec...

Correct. This will happen on the master node of this job, i.e. where
the jobscript is executed. But the starter method will also be used
for the `qrsh -inherit` calls. I wonder about one thing: I see only a
call to "orted" and not the above sub-shell on my machines. Did you
compile Open MPI with --with-sge?
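
One way to check for SGE support in an existing installation is to look for the gridengine components in the output of `ompi_info`. The following is only a sketch: the helper name `has_sge_support` is made up for illustration, and it assumes the `ompi_info` found first in PATH belongs to the installation used by the job.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the text on stdin mentions a
# gridengine component, as ompi_info does when built with SGE support.
has_sge_support() {
    grep -q 'gridengine'
}

if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_sge_support; then
        echo "gridengine components found"
    else
        echo "no gridengine components found; check the --with-sge build option"
    fi
else
    echo "ompi_info not found in PATH"
fi
```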

The original call above would be "ssh node_xy ( test ! ....)", which
seems to work for ssh and rsh.
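
The difference is that ssh and rsh hand the line to a remote shell, which understands the "( ... )" sub-shell syntax, while `exec` expects a program name. As a rough sketch only, a starter could pass such a line to /bin/sh itself. The function name `run_job_command` is hypothetical, and it is an assumption that the sub-shell line arrives beginning with "(". Joining the arguments with "$*" loses the original word boundaries, so this may not be safe for every command.

```shell
#!/bin/sh
# Hypothetical helper: run whatever command line the starter received.
run_job_command() {
    case "$1" in
        "("*)
            # Sub-shell syntax is not a program name; let a shell interpret it.
            exec /bin/sh -c "$*"
            ;;
        *)
            # Ordinary command plus options: exec it directly.
            exec "$@"
            ;;
    esac
}

# Only hand over control when there is something to run.
if [ "$#" -gt 0 ]; then
    run_job_command "$@"
fi
```

An ordinary jobscript path still goes through the plain `exec "$@"` branch, so the behaviour for qsub jobs is unchanged.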

Just one note: with the starter script you will lose the PATH and
LD_LIBRARY_PATH settings, as a new shell is created. It might be
necessary to set them again in your starter method.
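
A minimal sketch of such a starter method, assuming the install prefix from the command line quoted above. The variable name `OMPI_PREFIX` is only an illustration and the path has to be adjusted for your site.

```shell
#!/bin/sh
# Install prefix as seen in the quoted orted command line (assumption:
# the same prefix is valid on every node).
OMPI_PREFIX=${OMPI_PREFIX:-/homes/kazi/glembek/share/openmpi-1.3.3-64}

# Put the Open MPI binaries and libraries in front of the existing settings.
PATH=$OMPI_PREFIX/bin:$PATH
LD_LIBRARY_PATH=$OMPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
export PATH LD_LIBRARY_PATH

# Start the job in this shell, so it inherits the variables set above.
if [ "$#" -gt 0 ]; then
    exec "$@"
fi
```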

-- Reuti

>
> Am I not right???
> Ondrej
>
>
> Reuti wrote:
>> Hi,
>> On 30.11.2009 at 16:33, Ondrej Glembek wrote:
>>> we are using a custom starter method in our SGE to launch our
>>> jobs... It looks something like this:
>>>
>>> #!/bin/sh
>>>
>>> # ... we do whole bunch of stuff here
>>>
>>> # start the job in this shell
>>> exec "$@"
>> the "$@" should be replaced by the path to the jobscript (qsub) or
>> command (qrsh) plus the given options.
>> For the tasks spread to other nodes I get as argument:
>> "orted -mca ess env -mca orte_ess_jobid ...". Also no . ./.profile.
>> So I wonder where the . ./.profile is coming from. Can you put a
>> `sleep 60` or similar before the `exec ...` and grep the built line
>> from `ps -e f` before it crashes?
>> -- Reuti
>>> The trouble is that mpiexec passes a command which looks like this:
>>>
>>> ( . ./.profile ..... )
>>>
>>> which, however, is not a valid exec argument...
>>>
>>> Is there any way to tell mpiexec to run it in a separate
>>> script??? Any idea how to solve this???
>>>
>>> Thanx
>>> Ondrej Glembek
>>>
>>> --
>>>
>>> Ondrej Glembek, PhD student E-mail: glembek_at_[hidden]
>>> UPGM FIT VUT Brno, L226 Web: http://www.fit.vutbr.cz/~glembek
>>> Bozetechova 2, 612 66 Phone: +420 54114-1292
>>> Brno, Czech Republic Fax: +420 54114-1290
>>>
>>> ICQ: 93233896
>>> GPG: C050 A6DC 7291 6776 9B69 BB11 C033 D756 6F33 DE3C
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users