Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] MPI_Comm_spawn under Torque
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-02-20 21:44:30


Hmmm...I don't see anything immediately glaring. What do you mean by "doesn't work"? Is there some specific behavior you see?

You might try the attached program. It's a simple spawn test we use - 1.7.4 seems happy with it.


On Feb 20, 2014, at 10:14 AM, Suraj Prabhakaran <suraj.prabhakaran_at_[hidden]> wrote:

> I am using 1.7.4!
>
> On Feb 20, 2014, at 7:00 PM, Ralph Castain wrote:
>
>> What OMPI version are you using?
>>
>> On Feb 20, 2014, at 7:56 AM, Suraj Prabhakaran <suraj.prabhakaran_at_[hidden]> wrote:
>>
>>> Hello!
>>>
>>> I am having a problem using MPI_Comm_spawn under Torque. It doesn't work when spawning more than 12 processes on various nodes. To be more precise, "sometimes" it works, and "sometimes" it doesn't!
>>>
>>> Here is my case. I obtain 5 nodes with 3 cores per node, and my $PBS_NODEFILE looks like this:
>>>
>>> node1
>>> node1
>>> node1
>>> node2
>>> node2
>>> node2
>>> node3
>>> node3
>>> node3
>>> node4
>>> node4
>>> node4
>>> node5
>>> node5
>>> node5
>>>
>>> I started a hello program (which just spawns itself; the children, of course, don't spawn any further) with
>>>
>>> mpiexec -np 3 ./hello
>>>
>>> Spawning 3 more processes (on node 2) - works!
>>> Spawning 6 more processes (on nodes 2 and 3) - works!
>>> Spawning 9 more processes (on nodes 2, 3, 4) - "sometimes" OK, "sometimes" not!
>>> Spawning 12 more processes (on nodes 2, 3, 4, 5) - "mostly" not!
>>>
>>> Ideally, I want to spawn about 32 processes across a large number of nodes, but at the moment this is impossible. I have attached my hello program to this email.
>>>
>>> I will be happy to provide any more info or verbose outputs if you could please tell me what exactly you would like to see.
>>>
>>> Best,
>>> Suraj
>>>
>>> <hello.c>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
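The node placements reported in the original message (3, 6, 9 and 12 spawned processes landing on nodes 2, 2-3, 2-4 and 2-5) are consistent with the slots in $PBS_NODEFILE being filled in order, after the three slots already taken by the `mpiexec -np 3` parents. The following is a minimal sketch of that arithmetic only. It assumes in-order slot filling, which is an assumption and not a statement about how the Open MPI mapper actually behaves, and the helper names `slots_per_node` and `nodes_for_spawn` are hypothetical.

```python
def slots_per_node(nodefile_text):
    """Count slots per host: each line of a PBS nodefile is one slot."""
    slots = {}
    for host in nodefile_text.split():
        slots[host] = slots.get(host, 0) + 1
    return slots


def nodes_for_spawn(slots, used, nspawn):
    """Hosts that would receive `nspawn` new processes when the first
    `used` slots are occupied and slots are filled in nodefile order
    (an assumed policy, for illustration only)."""
    flat = [host for host, count in slots.items() for _ in range(count)]
    if used + nspawn > len(flat):
        raise ValueError("not enough slots in the allocation")
    chosen = flat[used:used + nspawn]
    # dict.fromkeys removes duplicates while keeping first-seen order
    return list(dict.fromkeys(chosen))


# The allocation from the message: 5 nodes, 3 slots each.
NODEFILE = "\n".join(f"node{i}" for i in range(1, 6) for _ in range(3))

if __name__ == "__main__":
    slots = slots_per_node(NODEFILE)
    for nspawn in (3, 6, 9, 12):
        print(nspawn, nodes_for_spawn(slots, used=3, nspawn=nspawn))
```

Under this assumption, spawning 12 processes uses every remaining slot in the 15-slot allocation, so any larger request would need more nodes from Torque.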