
Subject: Re: [OMPI users] Determining what parameters a scheduler passes to OpenMPI
From: Gus Correa (gus_at_[hidden])
Date: 2014-06-06 13:24:08

On 06/06/2014 01:05 PM, Ralph Castain wrote:
> You can always add --display-allocation to the cmd line to see what we
> thought we received.
> If you configure OMPI with --enable-debug, you can set --mca
> ras_base_verbose 10 to see the details

Hi John

On the Torque side, you can put a line "cat $PBS_NODEFILE" in the job
script. This lists the allocated nodes, each one repeated once per
core requested on it.
I find this useful as documentation,
along with the job number, work directory, etc.
"man qsub" will show you all the PBS_* environment variables
available to the job.
For instance, you can echo them from a Torque
'prolog' script, in case the user
didn't do it. That output will appear in the Torque STDOUT file.
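
The node file listing can be condensed into one line per node with a slot count, which is easier to compare against what was requested. A minimal sketch, assuming a POSIX shell; the function name summarize_nodefile is made up for illustration, and $PBS_NODEFILE is the variable Torque sets inside the job:

```shell
# Hypothetical helper: print "<slots> <hostname>" for each node in a
# Torque/PBS node file, which has one line per allocated slot.
summarize_nodefile() {
    # Use the argument if given, otherwise fall back to $PBS_NODEFILE.
    nodefile="${1:-${PBS_NODEFILE:-}}"
    if [ -z "$nodefile" ] || [ ! -r "$nodefile" ]; then
        echo "node file not readable: $nodefile" >&2
        return 1
    fi
    # sort groups identical hostnames, uniq -c counts each group,
    # and awk strips the padding that uniq -c puts before the count.
    sort "$nodefile" | uniq -c | awk '{ print $1, $2 }'
}
```

Called as "summarize_nodefile" from the job script, a 24-slot allocation split 16/8 over node0001 and node0002 would print "16 node0001" and "8 node0002".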

From outside the job script, "qstat -n" (and variants, say, with -u)
will list the nodes allocated to each job,
again repeated once per requested core.

"tracejob job_number" will show similar information.

If you configured Torque --with-cpuset,
there is more information about the cpuset allocated to the job
in /dev/cpuset/torque/jobnumber (on the first node listed above, called
"mother superior" in Torque parlance).
This mostly matters if there is more than one job running on a node.
However, Torque doesn't bind processes/MPI_ranks to cores or sockets or
whatever. As Ralph said, Open MPI does that.
I believe Open MPI doesn't use the cpuset info from Torque.
(Ralph, please correct me if I am wrong.)
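
To see which CPUs a job's cpuset contains, the directory mentioned above can be read directly on the node. This is only a sketch under assumptions: the helper name show_job_cpuset is hypothetical, and I am assuming the CPU list is in a file called "cpus" (legacy cpuset mount) or "cpuset.cpus" (cgroup-style mount). The exact layout under /dev/cpuset/torque may differ between Torque versions.

```shell
# Hypothetical helper: print the CPU list of a cpuset directory.
# Assumes the list is stored in a file named "cpus" or "cpuset.cpus".
show_job_cpuset() {
    dir="$1"
    for f in cpus cpuset.cpus; do
        if [ -r "$dir/$f" ]; then
            cat "$dir/$f"
            return 0
        fi
    done
    echo "no cpuset information in: $dir" >&2
    return 1
}
```

From inside the job, on the mother superior, this might be called as show_job_cpuset "/dev/cpuset/torque/$PBS_JOBID", assuming the directory is named after the full job id. On Linux the result can be compared with the Cpus_allowed_list line in /proc/self/status.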

My two cents,
Gus Correa

> On Jun 6, 2014, at 10:01 AM, Reuti <reuti_at_[hidden]> wrote:
>> On 06.06.2014 at 18:58, Sasso, John (GE Power & Water, Non-GE) wrote:
>>> OK, so at the least, how can I get the node and slots/node info that
>>> is passed from PBS?
>>> I ask because I’m trying to troubleshoot a problem w/ PBS and the
>>> build of OpenMPI 1.6 I noted. If I submit a 24-process simple job
>>> through PBS using a script which has:
>>> /usr/local/openmpi/bin/orterun -n 24 --hostfile
>>> /home/sasso/TEST/hosts.file --mca orte_rsh_agent rsh --mca btl
>>> openib,tcp,self --mca orte_base_help_aggregate 0 -x PATH -x
>>> LD_LIBRARY_PATH /home/sasso/TEST/simplempihello.exe
>> Supplying your own --hostfile would violate the slot allocation
>> granted by PBS. Just leave this option out. How do you submit
>> your job?
>> -- Reuti
>>> And the hostfile /home/sasso/TEST/hosts.file contains 24 entries (the
>>> first 16 being host node0001 and the last 8 being node0002), it
>>> appears that 24 MPI tasks try to start on node0001 instead of getting
>>> distributed as 16 on node0001 and 8 on node0002. Hence, I am
>>> curious what is being passed by PBS.
>>> --john
>>> From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Ralph
>>> Castain
>>> Sent: Friday, June 06, 2014 12:31 PM
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] Determining what parameters a scheduler
>>> passes to OpenMPI
>>> We currently only get the node and slots/node info from PBS - we
>>> don't get any task placement info at all. We then use the mpirun cmd
>>> options and built-in mappers to map the tasks to the nodes.
>>> I suppose we could do more integration in that regard, but haven't
>>> really seen a reason to do so - the OMPI mappers are generally more
>>> flexible than anything in the schedulers.
>>> On Jun 6, 2014, at 9:08 AM, Sasso, John (GE Power & Water, Non-GE)
>>> <John1.Sasso_at_[hidden]> wrote:
>>> For the PBS scheduler and using a build of OpenMPI 1.6 built against
>>> PBS include files + libs, is there a way to determine (perhaps via
>>> some debugging flags passed to mpirun) what job placement parameters
>>> are passed from the PBS scheduler to OpenMPI? In particular, I am
>>> talking about task placement info such as nodes to place on, etc.
>>> Thanks!
>>> --john
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden] <mailto:users_at_[hidden]>