Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] torque pbs behaviour...
From: Gus Correa (gus_at_[hidden])
Date: 2009-08-10 16:01:48


Hi Jody

We run Linux, not Mac OS X, so I'm not sure this applies to you.

Did you configure your OpenMPI with Torque support,
and point it to the same library that provides the
Torque you are using (--with-tm=/path/to/torque-library-directory)?
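
One way to check this, as a rough sketch: ompi_info lists the components
your OpenMPI was built with, and a build with Torque support normally
shows "tm" entries there. The helper name has_tm_support is made up for
illustration, and the exact framework labels (ras, plm, pls) are an
assumption that may differ between OpenMPI versions.

```shell
#!/bin/sh
# has_tm_support (hypothetical helper): reads `ompi_info` output on stdin
# and succeeds if a Torque "tm" component is listed.
# Assumption: tm shows up under the ras, plm, or pls framework.
has_tm_support() {
    grep -Eq 'MCA (ras|plm|pls): tm'
}

# Only run the check if ompi_info is actually on the PATH.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_tm_support; then
        echo "tm components found: this OpenMPI was built with Torque support"
    else
        echo "no tm components: rebuild with --with-tm=/path/to/torque-library-directory"
    fi
fi
```

If no tm components show up, mpirun cannot learn the node allocation from
Torque directly.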

Are you using the right mpirun? (There are so many out there.)
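
To see which one the job really picks up, something like the following
could go in the submitted script. It is only a sketch: report_mpirun is a
made-up helper name, and the #PBS line is copied from your message. It
prints the mpirun that the job's PATH resolves to, then counts the slots
per host that Torque put in $PBS_NODEFILE.

```shell
#!/bin/sh
#PBS -l nodes=2:ppn=8

# report_mpirun (hypothetical helper): print the mpirun found on PATH.
report_mpirun() {
    if command -v mpirun >/dev/null 2>&1; then
        echo "mpirun: $(command -v mpirun)"
    else
        echo "mpirun: not found in PATH"
    fi
}

report_mpirun

# Inside a Torque job, $PBS_NODEFILE names a file with one line per
# allocated slot; count the slots per host.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    sort "$PBS_NODEFILE" | uniq -c
fi
```

If the path printed is not the OpenMPI you built with Torque support, that
would explain the behaviour.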

Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

Jody Klymak wrote:
>
> Hi All,
>
> I've been trying to get torque pbs to work on my OS X 10.5.7 cluster
> with openMPI (after finding that Xgrid was pretty flaky about
> connections). I *think* this is an MPI problem (perhaps via operator
> error!)
>
> If I submit openMPI with:
>
>
> #PBS -l nodes=2:ppn=8
>
> mpirun MyProg
>
>
> PBS reserves the two nodes (checked via "pbsnodes -a" and the job
> output), but mpirun runs the whole job on the second of the two
> nodes.
>
> If I run the same job w/o qsub (i.e. using ssh)
> mpirun -n 16 -host xserve01,xserve02 MyProg
> it runs fine on all the nodes....
>
> My /var/spool/torque/server_priv/nodes file looks like:
>
> xserve01.local np=8
> xserve02.local np=8
>
>
> Any idea what could be going wrong or how to debug this properly? There
> is nothing suspicious in the server or mom logs.
>
> Thanks for any help,
>
> Jody
>
>
>
>
>
> --
> Jody Klymak
> http://web.uvic.ca/~jklymak/