Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] openmpi/pbsdsh/Torque problem
From: Laurence Marks (L-marks_at_[hidden])
Date: 2011-04-03 10:14:44


Let me expand on this slightly (in response to Ralph Castain's posting
-- I had digest mode set). As currently constructed, a shell script in
Wien2k (www.wien2k.at) launches a series of tasks using

($remote $remotemachine "cd $PWD;$t $ttt;rm -f .lock_$lockfile[$p]")
>>.time1_$loop &

where the standard setting for "remote" is "ssh", remotemachine is the
appropriate host, "t" is "time" and "ttt" is a concatenation of
commands. For instance, when using 2 cores on one node for Task1, 2
cores on each of 2 nodes for Task2 and 2 cores on 1 node for Task3:

Task1:
mpirun -v -x LD_LIBRARY_PATH -x PATH -np 2 -machinefile .machine1
/home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_1.def
Task2:
mpirun -v -x LD_LIBRARY_PATH -x PATH -np 4 -machinefile .machine2
/home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_2.def
Task3:
mpirun -v -x LD_LIBRARY_PATH -x PATH -np 2 -machinefile .machine3
/home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_3.def
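For concreteness, here is a hypothetical sketch of how the launch line
expands for Task1 with remote=ssh. The hostname (node01) and the
lock-file suffix are made up; the mpirun command is the Task1 command
above.

```shell
#!/bin/sh
# Hypothetical expansion of the Wien2k launch line for Task1.
# node01 and the lock-file suffix are invented for illustration.
remote=ssh
remotemachine=node01
t=time
ttt='mpirun -v -x LD_LIBRARY_PATH -x PATH -np 2 -machinefile .machine1 /home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_1.def'
lock=lapw1_1

# The script runs, backgrounded, appending timing output:
#   ($remote $remotemachine "cd $PWD;$t $ttt;rm -f .lock_$lock") >> .time1_1 &
# Print the command as it would be issued:
echo "$remote $remotemachine \"cd $PWD;$t $ttt;rm -f .lock_$lock\""
```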

This is a stable script: it works under SGI, Linux, MVAPICH and many
other environments using ssh or rsh (although I've never myself used
it with rsh). It is general purpose, i.e. it will run just 1 task on
8x8 nodes/cores, or 8 parallel tasks on 8 nodes all with 8 cores, or
any scatter of nodes/cores.

According to some, ssh is becoming obsolete within supercomputers and
the "replacement", at least under Torque, is pbsdsh. Getting pbsdsh to
work is certainly not as simple as the documentation I've seen
suggests. To get it to even partially work, I am using for "remote" a
script "pbsh" which creates an executable bash file
$PBS_O_WORKDIR/.tmp_$1 in which HOME, PATH, LD_LIBRARY_PATH etc., as
well as the PBS environment variables listed at the bottom of
http://www.bear.bham.ac.uk/bluebear/pbsdsh.shtml plus PBS_NODEFILE,
are exported, followed by the relevant command; the wrapper then runs

pbsdsh -h $1 /bin/bash -lc " $PBS_O_WORKDIR/.tmp_$1 "

This works fine so long as Task2 does not span 2 nodes (probably 3 or
more as well; I've not tested this). If it does, there is a
communications failure and nothing is launched on the 2nd node of
Task2.
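To show how the wrapper is wired in, here is a hypothetical sketch:
Wien2k's "remote" variable is pointed at the wrapper (the path below
is made up), so the unchanged launch line invokes pbsh with the host
as $1 and the whole command string as $2.

```shell
#!/bin/sh
# Hypothetical wiring of the pbsh wrapper in place of ssh.
# $HOME/bin/pbsh, node02 and /scratch/case are invented for illustration.
remote=$HOME/bin/pbsh
remotemachine=node02
cmd='cd /scratch/case; time mpirun -np 4 -machinefile .machine2 lapw1Q_mpi lapw1Q_2.def'

# ($remote $remotemachine "$cmd") >> .time1_2 &  then becomes:
echo "$remote $remotemachine \"$cmd\""
```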

I'm including the script below, as maybe some other environment
variables are needed, or some should not be there, in order to
properly rebuild the environment so things will work. (And yes, I know
there should be tests to see whether the variables are set first and
so forth, and this is not so clean; this is just an initial version.)

----------
# Script to replace ssh by pbsdsh
# Beta version, April 2011, L. D. Marks
#
# Remove old file -- needed !
rm -f $PBS_O_WORKDIR/.tmp_$1

# Create a script that exports the environment we have
# This may not be enough
# Quote the shebang: an unquoted # starts a shell comment, so the
# redirection would never happen and the file would not be created
echo '#!/bin/bash' > $PBS_O_WORKDIR/.tmp_$1
echo source $HOME/.bashrc >> $PBS_O_WORKDIR/.tmp_$1
echo cd $PBS_O_WORKDIR >> $PBS_O_WORKDIR/.tmp_$1
echo export PATH=$PBS_O_PATH >> $PBS_O_WORKDIR/.tmp_$1
echo export TMPDIR=$TMPDIR >> $PBS_O_WORKDIR/.tmp_$1
echo export SCRATCH=$SCRATCH >> $PBS_O_WORKDIR/.tmp_$1
echo export LD_LIBRARY_PATH=$LD_LIBRARY_PATH >> $PBS_O_WORKDIR/.tmp_$1

# Open MPI needs to have this defined, even if we don't use it
echo export PBS_NODEFILE=$PBS_NODEFILE >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_ENVIRONMENT=$PBS_ENVIRONMENT >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_JOBCOOKIE=$PBS_JOBCOOKIE >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_JOBID=$PBS_JOBID >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_JOBNAME=$PBS_JOBNAME >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_MOMPORT=$PBS_MOMPORT >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_NODENUM=$PBS_NODENUM >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_O_HOME=$PBS_O_HOME >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_O_HOST=$PBS_O_HOST >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_O_LANG=$PBS_O_LANG >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_O_LOGNAME=$PBS_O_LOGNAME >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_O_MAIL=$PBS_O_MAIL >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_O_PATH=$PBS_O_PATH >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_O_QUEUE=$PBS_O_QUEUE >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_O_SHELL=$PBS_O_SHELL >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_O_WORKDIR=$PBS_O_WORKDIR >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_QUEUE=$PBS_QUEUE >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_TASKNUM=$PBS_TASKNUM >> $PBS_O_WORKDIR/.tmp_$1
echo export PBS_VNODENUM=$PBS_VNODENUM >> $PBS_O_WORKDIR/.tmp_$1

# Now the command we want to run (quoted so spacing survives)
echo "$2" >> $PBS_O_WORKDIR/.tmp_$1

# Make it executable
chmod a+x $PBS_O_WORKDIR/.tmp_$1

pbsdsh -h $1 /bin/bash -lc " $PBS_O_WORKDIR/.tmp_$1 "

#Cleanup if needed (commented out for debugging)
#rm $PBS_O_WORKDIR/.tmp_$1
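As an aside, the many echo lines could be collapsed into a single
here-document. Below is a hypothetical compact variant, wrapped in a
function and showing only a subset of the PBS_* variables; pbsdsh
itself of course only exists inside a running Torque job.

```shell
#!/bin/sh
# Hypothetical compact rewrite of the pbsh wrapper using a here-document.
# Only some of the exported variables are shown; extend as needed.
pbsh_sketch() {
  tmp="$PBS_O_WORKDIR/.tmp_$1"
  rm -f "$tmp"
  # Unquoted EOF so the PBS_* variables and $2 expand now;
  # \$HOME stays literal so .bashrc is sourced on the target node.
  cat > "$tmp" <<EOF
#!/bin/bash
source \$HOME/.bashrc
cd $PBS_O_WORKDIR
export PATH=$PBS_O_PATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH
export PBS_NODEFILE=$PBS_NODEFILE
export PBS_JOBID=$PBS_JOBID
export PBS_O_WORKDIR=$PBS_O_WORKDIR
$2
EOF
  chmod a+x "$tmp"
  # Needs a live Torque job; stub pbsdsh to test the generation logic
  pbsdsh -h "$1" /bin/bash -lc "$tmp"
}
```

The unquoted EOF delimiter is the key design choice: it makes the
heredoc expand the job's environment at generation time, exactly as
the echo lines above do, while escaped variables survive to run on the
remote node.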

On Sat, Apr 2, 2011 at 9:36 PM, Laurence Marks <L-marks_at_[hidden]> wrote:
> I have a problem which may or may not be an openmpi issue, but since
> this list was useful before with a race condition, I am posting.
>
> I am trying to use pbsdsh as an ssh replacement, pushed by sysadmins
> since Torque does not know about ssh tasks launched from within a
> job. In a simple case, a script launches three mpi tasks in parallel,
>
> Task1: NodeA
> Task2: NodeB and NodeC
> Task3: NodeD
>
> (some cores on each, all handled correctly). Reproducibly (but with
> different nodes and numbers of cores), Task1 and Task3 work fine; the
> mpi task starts on NodeB but nothing starts on NodeC, and it appears
> that NodeC does not communicate. It does not have to be this exact
> layout; it could be
>
> Task1: NodeA NodeB
> Task2: NodeC NodeD
>
> Here NodeC will start and it looks as if NodeD never starts anything.
> I've also run it with 4 tasks (1, 3 and 4 work), and if Task2 only
> uses one node (the number of cores does not matter) it is fine.
>

-- 
Laurence Marks
Department of Materials Science and Engineering
MSE Rm 2036 Cook Hall
2220 N Campus Drive
Northwestern University
Evanston, IL 60208, USA
Tel: (847) 491-3996 Fax: (847) 491-7820
email: L-marks at northwestern dot edu
Web: www.numis.northwestern.edu
Chair, Commission on Electron Crystallography of IUCR
www.numis.northwestern.edu/
Research is to see what everybody else has seen, and to think what
nobody else has thought
Albert Szent-Györgyi