Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Gridengine + Open MPI
From: Romaric David (david_at_[hidden])
Date: 2008-07-07 05:31:08

Pak Lui wrote:
> It was fixed at one point in the trunk before v1.3 went official, but
> while rolling the code from the gridengine PLM into the rsh PLM code, this
> feature was left out because there were some lingering issues that I
> didn't resolve, and I lost track of it. Sorry, and thanks for bringing it
> up. I will need to look at the issue again and reopen this ticket
> against v1.3:
OK, so do I have to wait for a 1.3 release to get job suspend working, or
will the fix be back-ported to 1.2.6?

> So even though it is the rsh PLM that starts the parallel job under SGE,
> the rsh PLM can detect whether the Open MPI job was started under the SGE
> Parallel Environment (by checking some SGE env vars) and use the "qrsh
> -inherit" command to launch the parallel job the same way as it did
> before. You can check by setting an MCA parameter such as "--mca
> plm_base_verbose 10" in your mpirun command and looking for the launch
> commands that mpirun uses.
It looks like the shepherd cannot be started, for a reason I have not been
able to determine yet:
/opt/SGE/utilbin/lx24-amd64/rsh exited with exit code 0
reading exit code from shepherd ... 255
[hostname:16745] ----------------------------
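
For anyone who wants to repeat the diagnostic run suggested above, here is a
minimal sketch in Python. It only checks whether the process appears to be
inside an SGE parallel environment and builds the mpirun command line with
the verbose launcher setting quoted above. The list of environment variables
is an assumption about what is checked, not something confirmed in this
thread, and "./my_app" and the process count are hypothetical placeholders.

```python
import os
import shlex

# Assumption: these SGE variables indicate a job running inside a
# parallel environment. The thread only says "some SGE env vars".
SGE_ENV_VARS = ("SGE_ROOT", "ARC", "PE_HOSTFILE", "JOB_ID")


def running_under_sge(env=None):
    """Return True if every assumed SGE variable is set and non-empty."""
    env = os.environ if env is None else env
    return all(env.get(name) for name in SGE_ENV_VARS)


def build_mpirun_command(app, nprocs, verbosity=10):
    """Build an mpirun argument list with launcher (PLM) verbosity enabled."""
    return [
        "mpirun",
        "--mca", "plm_base_verbose", str(verbosity),
        "-np", str(nprocs),
        app,
    ]


if __name__ == "__main__":
    print("SGE parallel environment detected:", running_under_sge())
    # "./my_app" is a hypothetical application name.
    print(shlex.join(build_mpirun_command("./my_app", 4)))
```

Running the printed command inside the SGE job and searching the output for
"qrsh" should show which launch command mpirun actually uses.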


   R. David - david_at_[hidden]
   Tel. : 03 90 24 45 48  (Fax 45 47)