I'm trying to create a tight integration between Torque and Open MPI for cases
where the tm ras and plm components aren't compiled into Open MPI. This scenario
is common for Linux distros that ship Open MPI. Of course the ideal solution is
to recompile Open MPI with Torque support, but this isn't always feasible, since
I do not want to support my own version of Open MPI for the software I'm
distributing to others.
We also see some proprietary applications shipping their own embedded Open MPI
libraries where the tm plm/ras components are either missing or do not work
with the Torque installation on our system.
So far I have created a pbsdshwrapper.py that mimics ssh behaviour very
closely, so that starting the orteds on all the hosts works as expected
and the application starts correctly when I use:
setenv OMPI_MCA_plm_rsh_agent "pbsdshwrapper.py"
mpirun --hostfile $PBS_NODEFILE ........
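As a rough illustration of the wrapper idea, here is a minimal sketch of how
ssh-style arguments could be turned into a pbsdsh call. This is not the actual
pbsdshwrapper.py; the function name build_pbsdsh_command and the list of ssh
options it skips are assumptions made for the example, and it assumes the
installed pbsdsh accepts "-h hostname" to pick the target node.

```python
def build_pbsdsh_command(argv):
    """Translate ssh-style arguments (options, host, command) into a pbsdsh call.

    Hypothetical helper for illustration only.
    """
    args = list(argv)
    # ssh options that take a separate value. Assumption: only these forms
    # show up; a real wrapper would have to cover more of ssh's options.
    options_with_value = {"-o", "-l", "-p", "-i"}
    i = 0
    # Skip leading ssh options, since pbsdsh has no use for them.
    while i < len(args) and args[i].startswith("-"):
        i += 2 if args[i] in options_with_value else 1
    if i >= len(args):
        raise ValueError("no host given")
    host = args[i]
    remote = args[i + 1:]
    if not remote:
        raise ValueError("no remote command given")
    # ssh joins the remaining words and hands them to a shell on the remote
    # side, so the same is done here through /bin/sh -c.
    return ["pbsdsh", "-h", host, "/bin/sh", "-c", " ".join(remote)]
```

A real wrapper would pass sys.argv[1:] to such a function and then exec the
resulting command, for example with os.execvp.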
What I want now is a way to get rid of the --hostfile $PBS_NODEFILE in the
mpirun command. Is there an environment variable that I can set so that
mpirun grabs the right nodelist?
By spelunking the code I find that the rsh plm has support for SGE, where it
automatically picks up the PE_HOSTFILE if it detects that it is launched
within an SGE job. Would it be possible to have the same functionality for
Torque? The code looks a bit too complex at first sight for me to fix this.
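To make the request concrete, the detection being asked for could look roughly
like the sketch below. It is written in Python for illustration and is not
Open MPI code; the function name torque_nodes is hypothetical. It assumes that
PBS_JOBID is set inside a Torque job and that PBS_NODEFILE points to a file
with one hostname per line, repeated once per allocated slot.

```python
import os
from collections import Counter


def torque_nodes(environ=None):
    """Return [(hostname, slots), ...] inside a Torque job, else None.

    Hypothetical helper for illustration only.
    """
    env = os.environ if environ is None else environ
    # Assumption: PBS_JOBID is only present when running inside a job.
    if "PBS_JOBID" not in env:
        return None
    nodefile = env.get("PBS_NODEFILE")
    if not nodefile:
        return None
    with open(nodefile) as fh:
        hosts = [line.strip() for line in fh if line.strip()]
    # Each repeated hostname counts as one more slot on that node.
    counts = Counter(hosts)
    # dict.fromkeys keeps the order in which hosts first appear.
    return [(host, counts[host]) for host in dict.fromkeys(hosts)]
```

Outside a job the function returns None, so a launcher could fall back to its
normal behaviour in that case.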
Roy Dragseth, Team Leader, High Performance Computing
The Computer Center, University of Tromsø, N-9037 TROMSØ Norway.
phone:+47 77 64 41 07, fax:+47 77 64 41 00
Direct call: +47 77 64 62 56. email: roy.dragseth_at_[hidden]