Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] mpirun oddity w/ PBS on an SGI UV
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-01-31 17:47:27


We read the nodes from the PBS_NODEFILE, Paul. Can you pass along the contents of that file?
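
A minimal sketch of one way to summarize that file from inside the job, in case it helps. It assumes the usual Torque convention of one line per allocated slot, so a host listed only once would be consistent with mpirun seeing a single slot. The function name count_slots is purely illustrative and not part of Open MPI or Torque.

```shell
#!/bin/sh
# Summarize a PBS/Torque node file: print "host count" for each distinct
# host, where count is how many times that host appears in the file.
# Assumption: the batch system writes one line per allocated slot.
count_slots() {
    nodefile="$1"
    if [ ! -r "$nodefile" ]; then
        echo "cannot read node file: $nodefile" >&2
        return 1
    fi
    # sort groups identical host names; uniq -c counts each group;
    # awk swaps the columns so the host name comes first.
    sort "$nodefile" | uniq -c | awk '{ print $2, $1 }'
}

# Inside a batch job, PBS_NODEFILE is set by the batch system.
if [ -n "${PBS_NODEFILE:-}" ]; then
    count_slots "$PBS_NODEFILE"
fi
```

Running this in the batch script just before the mpirun line shows how many entries each host has in the node file.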

On Jan 31, 2014, at 2:33 PM, Paul Hargrove <phhargrove_at_[hidden]> wrote:

> I am trying to test the trunk on an SGI UV (to validate Nathan's port of btl:vader to SGI's variant of xpmem).
>
> At configure time, PBS's TM support was correctly located.
>
> My PBS batch script includes
> #PBS -l ncpus=16
> because that is what this installation requires (not nodes, mppnodes, or anything like that).
> One is allocating cpus on a large shared-memory machine, not a set of nodes in a cluster.
>
> However, this appears to be causing mpirun to think I have just 1 slot:
>
> + mpirun -np 2 ./ring_c
> --------------------------------------------------------------------------
> There are not enough slots available in the system to satisfy the 2 slots
> that were requested by the application:
> ./ring_c
>
> Either request fewer slots for your application, or make more slots available
> for use.
> --------------------------------------------------------------------------
>
> In case they contain useful info, here are the PBS env vars in the job:
>
> PBS_HT_NCPUS=32
> PBS_VERSION=TORQUE-2.3.13
> PBS_JOBNAME=qs
> PBS_ENVIRONMENT=PBS_BATCH
> PBS_HOME=/var/spool/torque
> PBS_O_WORKDIR=/usr/users/6/hargrove/SCRATCH/OMPI/openmpi-trunk-linux-x86_64-uv-trunk/BLD/examples
> PBS_PPN=16
> PBS_TASKNUM=1
> PBS_O_HOME=/usr/users/6/hargrove
> PBS_MOMPORT=15003
> PBS_O_QUEUE=debug
> PBS_O_LOGNAME=hargrove
> PBS_O_LANG=en_US.UTF-8
> PBS_JOBCOOKIE=9EEF5DF75FA705A241FEF66EDFE01C5B
> PBS_NODENUM=0
> PBS_O_SHELL=/usr/psc/shells/bash
> PBS_SERVER=tg-login1.blacklight.psc.teragrid.org
> PBS_JOBID=314827.tg-login1.blacklight.psc.teragrid.org
> PBS_NCPUS=16
> PBS_O_HOST=tg-login1.blacklight.psc.teragrid.org
> PBS_VNODENUM=0
> PBS_QUEUE=debug_r1
> PBS_O_MAIL=/var/mail/hargrove
> PBS_NODEFILE=/var/spool/torque/aux//314827.tg-login1.blacklight.psc.teragrid.org
> PBS_O_PATH=[...removed...]
>
> If any additional info is needed to help make mpirun "just work", please let me know.
>
> However, at this point I am mostly interested in any work-arounds that will let me run something other than a singleton on this system.
>
> -Paul
>
> --
> Paul H. Hargrove PHHargrove_at_[hidden]
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900