Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Hostfile info argument with MPI_COMM_SPAWN in a Torque environment
From: Ralph Castain (rhc_at_[hidden])
Date: 2013-03-22 00:28:28


On Mar 21, 2013, at 1:35 PM, Sebastian Rinke <s.rinke_at_[hidden]> wrote:

> Dear all,
>
> I'm using OMPI 1.6.4 in a Torque-like environment.
> However, since there are modifications in Torque that prevent OMPI from spawning processes the way it does with MPI_COMM_SPAWN,

That hasn't been true in the past - did you folks locally modify Torque to prevent it?

> I want to circumvent Torque and use plain ssh only.
>
> So, I configured --without-tm and can successfully run mpiexec with -hostfile.
>
> Now I want to call MPI_COMM_SPAWN using the hostfile info argument.
>
> I start with
>
> $ mpiexec -np 1 -hostfile hostfile_all ./spawn_parent
>
> where hostfile_all is a superset of hostfile_spawn, which is provided in the info argument to MPI_COMM_SPAWN.
>
> The message I get is:
>
> --------------------------------------------------------------------------
> All nodes which are allocated for this job are already filled.
> --------------------------------------------------------------------------

I'll take a look in the morning when my cluster comes back up - sounds like we have a bug. However, note that there are no current plans for a 1.6.5 release, so I don't know how long it will be before any fix shows up.

Meantime, I'll check the 1.7 series to ensure it works correctly there as well.
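
For reference, a minimal sketch of the spawn call described above. The child binary name ./spawn_child and the process count of 2 are placeholders that do not appear in the thread; hostfile_spawn is the file Sebastian mentions, and the parent is launched with the mpiexec command shown earlier.

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Comm children;
        MPI_Info info;

        MPI_Init(&argc, &argv);

        /* Ask Open MPI to place the spawned processes on the hosts
         * listed in hostfile_spawn (a subset of hostfile_all). */
        MPI_Info_create(&info);
        MPI_Info_set(info, "hostfile", "hostfile_spawn");

        /* "./spawn_child" and the count of 2 are placeholders. */
        MPI_Comm_spawn("./spawn_child", MPI_ARGV_NULL, 2, info,
                       0, MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);

        MPI_Info_free(&info);
        MPI_Comm_disconnect(&children);
        MPI_Finalize();
        return 0;
    }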

>
> Any help on this is highly appreciated.
> Thank you.
>
> Sebastian
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
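
For illustration only, the setup described above corresponds to something like the following pair of hostfiles (the node names and slot counts are made up, not taken from the report):

    hostfile_all:
        node01 slots=4
        node02 slots=4
        node03 slots=4

    hostfile_spawn:
        node03 slots=4

    $ mpiexec -np 1 -hostfile hostfile_all ./spawn_parent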