I don't think the new built-in rsh in later versions of Grid Engine is
going to make any difference - the orted is the real starter of the
MPI tasks and should have a greater influence on the task environment.
However, it would help if you could record the nice values and resource
limits of each MPI task - you can easily do so by using a shell
wrapper like this one:
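#!/bin/sh
# resource limits, recorded in one file per task
ulimit -a > /tmp/mpijob.$$
# nice value of this wrapper process only (printed to stdout)
ps -o pid,user,nice,command -p $$
# run the real executable, passing along any arguments
<PATH to real executable> "$@"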
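Once a job has gone through a wrapper of this kind, the per-task files can be compared to see whether every task got the same limits. The following is only a rough sketch: the function name summarize_limits and its directory argument are made up for illustration, and it assumes the files contain nothing but the `ulimit -a` output and are named mpijob.<pid>.

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid Engine or Open MPI):
# count how many recorded files share each distinct set of limits.
# Assumes files named mpijob.<pid> in the directory given as $1
# (default /tmp), each holding only "ulimit -a" output.
summarize_limits() {
    dir=${1:-/tmp}
    for f in "$dir"/mpijob.*; do
        # skip the unexpanded pattern when no file matches
        [ -f "$f" ] || continue
        # identical ulimit output gives an identical checksum
        cksum < "$f" | awk '{print $1}'
    done | sort | uniq -c
}
```

Since /tmp is local to each node, this would have to be run on every node that hosted tasks. A single output line means all tasks on that node recorded the same limits; more than one line means at least one task was started with different limits, and the files themselves can then be diffed.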
Use mpirun to submit the wrapper as if it were the real MPI application - then
you can see if there are limits introduced by Grid Engine that are
Open Grid Scheduler / Grid Engine
Scalable Grid Engine Support Program
On Thu, Mar 15, 2012 at 12:28 AM, Joshua Baker-LePain <jlb17_at_[hidden]> wrote:
> On Thu, 15 Mar 2012 at 12:44am, Reuti wrote
>> Which version of SGE are you using? The traditional rsh startup was
>> replaced by the builtin startup some time ago (although it should still
> We're currently running the rather ancient 6.1u4 (due to the "If it ain't
> broke..." philosophy). The hardware for our new queue master recently
> arrived and I'll soon be upgrading to the most recent Open Grid Scheduler
> release. Are you saying that the upgrade with the new builtin startup
> method should avoid this problem?
>> Maybe this shows already the problem: there are two `qrsh -inherit`, as
>> Open MPI thinks these are different machines (I ran only with one slot on
>> each host hence didn't get it first but can reproduce it now). But for SGE
>> both may end up in the same queue overriding the openmpi-session in $TMPDIR.
>> Although it's running: you get all output? If I request 4 slots and get
>> one from each queue on both machines the mpihello outputs only 3 lines: the
>> "Hello World from Node 3" is always missing.
> I do seem to get all the output -- there are indeed 64 Hello World lines.
> Thanks again for all the help on this. This is one of the most productive
> exchanges I've had on a mailing list in far too long.
> Joshua Baker-LePain
> QB3 Shared Cluster Sysadmin