Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] mpirun oddity w/ PBS on an SGI UV
From: Paul Hargrove (phhargrove_at_[hidden])
Date: 2014-02-06 12:47:26


Ralph,

It worked on my second try, when I spelled it "ras_tm_smp" :-)
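
For the archives, the invocation looks like this ("ras_tm_smp" being the spelling
that worked for me; the explicit "1" is illustrative for a boolean param, and it
can equally be set as OMPI_MCA_ras_tm_smp=1 in the environment):

    $ mpirun --mca ras_tm_smp 1 -np 16 ./ring_c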

Thanks,
-Paul

On Wed, Feb 5, 2014 at 11:59 AM, Paul Hargrove <phhargrove_at_[hidden]> wrote:

> Ralph,
>
> I will try to build tonight's trunk tarball and then test a run tomorrow.
> Please ping me if I don't post my results by Thu evening (PST).
>
> -Paul
>
>
> On Wed, Feb 5, 2014 at 7:52 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>
>> I added this to the trunk in r30568 - a new MCA param "ras_tm_smp_mode"
>> will tell us to use the PBS_PPN envar to get the number of slots allocated
>> per node. We then just use the PBS_NODEFILE to read the names of the nodes,
>> which I expect will be one for each partition.
>>
>> Let me know if this solves the problem - I scheduled it for 1.7.5
>>
>> Thanks!
>> Ralph
>>
>> On Jan 31, 2014, at 4:33 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>
>> No worries about PBS itself - better to allow you to just run this way.
>> Easy to add a switch for this purpose.
>>
>> For now, just add --oversubscribe to the command line
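>>
>> For your ring_c test that would look something like this (the -np count is
>> just illustrative):
>>
>>   $ mpirun --oversubscribe -np 16 ./ring_c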
>>
>> On Jan 31, 2014, at 3:32 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
>>
>> Ralph,
>>
>> The mods may have been done by the staff at PSC rather than by SGI.
>> Note the "_psc" suffix:
>> $ which pbsnodes
>> /usr/local/packages/torque/2.3.13_psc/bin/pbsnodes
>>
>> Their sources appear to be available in the f/s too.
>> Using "tar -d" to compare that to the pristine torque-2.3.13 tarball show
>> the following files were modified:
>> torque-2.3.13/src/resmom/job_func.c
>> torque-2.3.13/src/resmom/mom_main.c
>> torque-2.3.13/src/resmom/requests.c
>> torque-2.3.13/src/resmom/linux/mom_mach.h
>> torque-2.3.13/src/resmom/linux/mom_mach.c
>> torque-2.3.13/src/resmom/linux/cpuset.c
>> torque-2.3.13/src/resmom/start_exec.c
>> torque-2.3.13/src/scheduler.tcl/pbs_sched.c
>> torque-2.3.13/src/cmds/qalter.c
>> torque-2.3.13/src/cmds/qsub.c
>> torque-2.3.13/src/cmds/qstat.c
>> torque-2.3.13/src/server/resc_def_all.c
>> torque-2.3.13/src/server/req_quejob.c
>> torque-2.3.13/torque.spec
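>>
>> For reference, the comparison was along these lines (the paths are illustrative,
>> not the actual locations on the system):
>>
>>   $ cd /path/containing/psc/sources   # hypothetical: the directory holding PSC's torque-2.3.13/ tree
>>   $ tar -dzf /tmp/torque-2.3.13.tar.gz   # tar's -d/--diff mode reports members that differ from files on disk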
>>
>> I'll provide what assistance I can in testing.
>> That includes providing (off-list) the actual diffs of PSC's torque
>> against the tarball, if desired.
>>
>> In the meantime, since -npernode didn't work, what is the right way to
>> say:
>> "I have 1 slot but I want to overcommit and run 16 MPI ranks"?
>>
>> -Paul
>>
>>
>> On Fri, Jan 31, 2014 at 3:20 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>
>>>
>>> On Jan 31, 2014, at 3:13 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
>>>
>>> Ralph,
>>>
>>> As I said, this is NOT a cluster - it is a 4k-core shared-memory machine.
>>>
>>>
>>> I understood - that wasn't the nature of my question
>>>
>>> TORQUE is allocating CPUs (time-shared mode, IIRC), not nodes.
>>> So, there is always exactly one line in $PBS_NODEFILE.
>>>
>>>
>>> Interesting - because that isn't the standard way Torque behaves. It is
>>> supposed to put one line per slot in the nodefile, each line containing the
>>> name of the node. Clearly, SGI has reconfigured Torque to do something
>>> different.
>>>
>>>
>>> The system runs as 2 partitions of 2k cores each.
>>> So, the contents of $PBS_NODEFILE have exactly 2 possible values, each 1
>>> line.
>>>
>>> The values of PBS_PPN and PBS_NCPUS both reflect the size of the
>>> allocation.
>>>
>>> At a minimum, shouldn't Open MPI be multiplying the lines in
>>> $PBS_NODEFILE by the value of $PBS_PPN?
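>>> That is, on this system the desired slot count works out to:
>>>
>>>   $ echo $(( $(wc -l < $PBS_NODEFILE) * PBS_PPN ))   # 1 line x PPN=16 -> 16 slots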
>>>
>>>
>>> No, as above, that isn't the way Torque generally behaves. It would
>>> appear that we need a "switch" here to handle SGI's modifications. Should
>>> be doable - just haven't had anyone using an SGI machine before :-)
>>>
>>>
>>> Additionally, when I try "mpirun -npernode 16 ./ring_c" I am still told
>>> there are not enough slots.
>>> Shouldn't that be working with 1 line in $PBS_NODEFILE?
>>>
>>> -Paul
>>>
>>>
>>>
>>>
>>> On Fri, Jan 31, 2014 at 2:47 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>>
>>>> We read the nodes from the PBS_NODEFILE, Paul - can you pass that along?
>>>>
>>>> On Jan 31, 2014, at 2:33 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:
>>>>
>>>> I am trying to test the trunk on an SGI UV (to validate Nathan's port
>>>> of btl:vader to SGI's variant of xpmem).
>>>>
>>>> At configure time, PBS's TM support was correctly located.
>>>>
>>>> My PBS batch script includes
>>>> #PBS -l ncpus=16
>>>> because that is what this installation requires (not nodes, mppnodes,
>>>> or anything like that).
>>>> One is allocating CPUs on a large shared-memory machine, not a set of
>>>> nodes in a cluster.
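>>>>
>>>> In outline the script is just the usual sort of thing (the lines other than
>>>> the ncpus directive are illustrative, not a copy of my actual script):
>>>>
>>>>   #!/bin/bash
>>>>   #PBS -l ncpus=16
>>>>   #PBS -q debug
>>>>   cd $PBS_O_WORKDIR
>>>>   mpirun -np 2 ./ring_c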
>>>>
>>>> However, this appears to be causing mpirun to think I have just 1 slot:
>>>>
>>>> + mpirun -np 2 ./ring_c
>>>>
>>>> --------------------------------------------------------------------------
>>>> There are not enough slots available in the system to satisfy the 2 slots
>>>> that were requested by the application:
>>>>   ./ring_c
>>>>
>>>> Either request fewer slots for your application, or make more slots
>>>> available for use.
>>>>
>>>> --------------------------------------------------------------------------
>>>>
>>>> In case they contain useful info, here are the PBS env vars in the job:
>>>>
>>>> PBS_HT_NCPUS=32
>>>> PBS_VERSION=TORQUE-2.3.13
>>>> PBS_JOBNAME=qs
>>>> PBS_ENVIRONMENT=PBS_BATCH
>>>> PBS_HOME=/var/spool/torque
>>>>
>>>> PBS_O_WORKDIR=/usr/users/6/hargrove/SCRATCH/OMPI/openmpi-trunk-linux-x86_64-uv-trunk/BLD/examples
>>>> PBS_PPN=16
>>>> PBS_TASKNUM=1
>>>> PBS_O_HOME=/usr/users/6/hargrove
>>>> PBS_MOMPORT=15003
>>>> PBS_O_QUEUE=debug
>>>> PBS_O_LOGNAME=hargrove
>>>> PBS_O_LANG=en_US.UTF-8
>>>> PBS_JOBCOOKIE=9EEF5DF75FA705A241FEF66EDFE01C5B
>>>> PBS_NODENUM=0
>>>> PBS_O_SHELL=/usr/psc/shells/bash
>>>> PBS_SERVER=tg-login1.blacklight.psc.teragrid.org
>>>> PBS_JOBID=314827.tg-login1.blacklight.psc.teragrid.org
>>>> PBS_NCPUS=16
>>>> PBS_O_HOST=tg-login1.blacklight.psc.teragrid.org
>>>> PBS_VNODENUM=0
>>>> PBS_QUEUE=debug_r1
>>>> PBS_O_MAIL=/var/mail/hargrove
>>>> PBS_NODEFILE=/var/spool/torque/aux//314827.tg-login1.blacklight.psc.teragrid.org
>>>> PBS_O_PATH=[...removed...]
>>>>
>>>> If any additional info is needed to help make mpirun "just work",
>>>> please let me know.
>>>>
>>>> However, at this point I am mostly interested in any work-arounds that
>>>> will let me run something other than a singleton on this system.
>>>>
>>>> -Paul
>>>>

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group
Computer and Data Sciences Department     Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900