OK, so at the least, how can I get the node and slots/node info that is passed from PBS?
I ask because I'm trying to troubleshoot a problem with PBS and the OpenMPI 1.6 build I noted. If I submit a simple 24-process job through PBS using a script which runs:
/usr/local/openmpi/bin/orterun -n 24 --hostfile /home/sasso/TEST/hosts.file --mca orte_rsh_agent rsh --mca btl openib,tcp,self --mca orte_base_help_aggregate 0 -x PATH -x LD_LIBRARY_PATH /home/sasso/TEST/simplempihello.exe
and the hostfile /home/sasso/TEST/hosts.file contains 24 entries (the first 16 being node0001 and the last 8 being node0002), then all 24 MPI tasks appear to start on node0001 instead of being distributed as 16 on node0001 and 8 on node0002. Hence, I am curious what is actually being passed by PBS.
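(For reference, the raw allocation PBS hands a job is visible inside the job in the file named by $PBS_NODEFILE, one hostname per allocated slot. A minimal sketch that collapses that list into Open MPI's "slots=" hostfile form; pbs_to_slots is a hypothetical helper name, and the demo file merely mimics the 16+8 allocation described above:)

```shell
#!/bin/sh
# Collapse a PBS-style node file (one hostname per allocated slot)
# into Open MPI's "host slots=N" hostfile form.
pbs_to_slots() {
    sort "$1" | uniq -c | awk '{print $2 " slots=" $1}'
}

# Inside a real PBS job you would run:  pbs_to_slots "$PBS_NODEFILE"
# Demo with a stand-in file mimicking the 16+8 allocation above:
{ yes node0001 | head -n 16; yes node0002 | head -n 8; } > nodes.demo
pbs_to_slots nodes.demo
# -> node0001 slots=16
#    node0002 slots=8
```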
From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Ralph Castain
Sent: Friday, June 06, 2014 12:31 PM
To: Open MPI Users
Subject: Re: [OMPI users] Determining what parameters a scheduler passes to OpenMPI
We currently only get the node and slots/node info from PBS - we don't get any task placement info at all. We then use the mpirun cmd options and built-in mappers to map the tasks to the nodes.
I suppose we could do more integration in that regard, but haven't really seen a reason to do so - the OMPI mappers are generally more flexible than anything in the schedulers.
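(A sketch of what those built-in mappers look like in practice, stated as an assumption about the 1.6 series: the default policy is byslot, i.e. fill each node's slots in hostfile order before moving on, and the hostfile can carry explicit slot counts rather than repeated hostnames:)

```
# hosts.file rewritten with explicit slot counts (same 16+8 layout):
node0001 slots=16
node0002 slots=8

# Default mapping is byslot: fill node0001's 16 slots, then node0002's 8.
# Round-robin across nodes instead:
#   orterun -n 24 --bynode --hostfile hosts.file ...
# Fixed number of processes per node:
#   orterun -n 24 --npernode 12 --hostfile hosts.file ...
```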
On Jun 6, 2014, at 9:08 AM, Sasso, John (GE Power & Water, Non-GE) <John1.Sasso_at_[hidden]> wrote:
For the PBS scheduler and using a build of OpenMPI 1.6 built against PBS include files + libs, is there a way to determine (perhaps via some debugging flags passed to mpirun) what job placement parameters are passed from the PBS scheduler to OpenMPI? In particular, I am talking about task placement info such as nodes to place on, etc. Thanks!
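(One way to surface this from mpirun itself, assuming the 1.6-series flags documented in its man page: --display-allocation prints the node/slot list mpirun obtained, whether from PBS or a hostfile, and --display-map prints the resulting rank-to-node placement before launch. A sketch using the command from later in this thread:)

```
orterun -n 24 --display-allocation --display-map \
    --hostfile /home/sasso/TEST/hosts.file /home/sasso/TEST/simplempihello.exe
```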