Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] openmpi+sge
From: Jaime Perea (jaime.perea_at_[hidden])
Date: 2008-10-02 10:51:48


Hi

They are set to builtin. Do I have to change them to ssh and sshd, as in SGE 6.1?
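
For anyone following the thread, here is a minimal sketch for listing the remote startup settings under discussion. It assumes they live in the global configuration printed by `qconf -sconf` under the names qlogin_command, qlogin_daemon, rlogin_command, rlogin_daemon, rsh_command and rsh_daemon. The function name remote_startup_settings is made up for illustration, so adapt it to your installation.

```shell
#!/bin/sh
# Filter an SGE configuration dump read on stdin (for example the output
# of `qconf -sconf`) down to the remote-startup entries only.
# Assumption: each entry starts at column 0, followed by whitespace and its value.
remote_startup_settings() {
    grep -E '^(qlogin|rlogin|rsh)_(command|daemon)[[:space:]]'
}

# Query the live global configuration only when qconf is on the PATH.
if command -v qconf >/dev/null 2>&1; then
    qconf -sconf | remote_startup_settings
fi
```

If every line printed shows builtin, the new 6.2 startup method is in use. Any other value, such as a path to ssh or sshd, means the older per-command settings are in use.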

Thanks again

--
Jaime Perea
On Thursday, 2 October 2008, Reuti wrote:
> On 02.10.2008 at 16:12, Jaime Perea wrote:
> > Hi again, thanks for the answer
> >
> > Actually I took the definition of the pe from the openmpi
> > webpage, in my case
> >
> > qconf -sp orte
> > pe_name            orte
> > slots              24
> > user_lists         NONE
> > xuser_lists        NONE
> > start_proc_args    /bin/true
> > stop_proc_args     /bin/true
> > allocation_rule    $round_robin
> > control_slaves     TRUE
> > job_is_first_task  TRUE
> > urgency_slots      min
> > accounting_summary FALSE
> >
> > Our SGE is version 6.2, and Open MPI was configured with
> > the --with-sge switch, of course.
>
> In SGE 6.2 two types of remote startup are implemented. Which one are
> you using (builtin or the former settings for each command) in the
> SGE configuration?
>
> -- Reuti
>
> > Regards
> >
> > --
> > Jaime Perea
> >
> > On Thursday, 2 October 2008, Reuti wrote:
> >> Hi,
> >>
> >> On 02.10.2008 at 15:37, Jaime Perea wrote:
> >>> Hello,
> >>>
> >>> I am having some problems with a combination of openmpi+sge6.2
> >>>
> >>> Currently I'm working with the 1.3a1r19666 openmpi release and the
> >>
> >> AFAIK, you have to enable SGE support in Open MPI 1.3 during its
> >> compilation.
> >>
> >>> myrinet gm libraries (2.1.19)  but the problem was the same with the
> >>> prior 1.3 version. In short, I'm able to send jobs to a queue via qrsh,
> >>> more or less this way,
> >>>
> >>> qrsh -cwd -V -q para -pe orte 6 mpirun -np 6 ctiming
> >>
> >> It should also work without specifying the number of slots a second
> >> time, i.e.:
> >>
> >> qrsh -cwd -V -q para -pe orte 6 mpirun ctiming
> >>
> >>> ctiming is a small test program, and this way it works. But if I try to
> >>> send the same task using qsub with a script like this one
> >>>
> >>> #!/bin/sh
> >>> #$ -pe orte 6
> >>
> >> This PE has just /bin/true for start-/stop_proc_args?
> >>
> >>> #$ -q para
> >>> #$ -cwd
> >>> #
> >>> mpirun -np $NSLOTS  /model/jaime/ctiming
> >>
> >> mpirun /model/jaime/ctiming
> >>
> >>> It fails with a message like this,
> >>> ..............
> >>>
> >>> error reading job context from "qlogin_starter"
> >>
> >> qlogin_starter should of course only be started with a qlogin command
> >> in SGE.
> >>
> >>> --------------------------------------------------------------------------
> >>> A daemon (pid 11207) died unexpectedly with status 1 while attempting
> >>> to launch so we are aborting.
> >>>
> >>> There may be more information reported by the environment (see above).
> >>>
> >>> This may be because the daemon was unable to find all the needed shared
> >>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> >>> location of the shared libraries on the remote nodes and this will
> >>> automatically be forwarded to the remote nodes.
> >>>
> >>> .............
> >>>
> >>> I know that LD_LIBRARY_PATH is not the problem, since I checked that the
> >>> whole environment is present. Any ideas?
> >>>
> >>> With previous releases of SGE and Open MPI I was able to make them work
> >>> together with a few wrappers,
> >>
> >> Which version of SGE are you using?
> >>
> >> -- Reuti
> >>
> >>> but now the integration looks much better!
> >>> This happens only when sending Open MPI jobs.
> >>>
> >>> Thanks and all the best
> >>>
> >>> ---
> >>>
> >>>            Jaime D. Perea Duarte. <jaime at iaa dot es>
> >>>              Linux registered user #10472
> >>>
> >>>            Dep. Astrofisica Extragalactica.
> >>>            Instituto de Astrofisica de Andalucia (CSIC)
> >>>            Apdo. 3004, 18080 Granada, Spain.
> >>> _______________________________________________
> >>> users mailing list
> >>> users_at_[hidden]
> >>> http://www.open-mpi.org/mailman/listinfo.cgi/users