Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How to specify hosts for MPI_Comm_spawn
From: Ralph Castain (rhc_at_[hidden])
Date: 2008-07-30 09:12:53


Singleton comm_spawn works fine on the 1.3 release branch. If
singleton comm_spawn is critical to your plans, I suggest moving to
that version. You can get a pre-release version from the
www.open-mpi.org web site.
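
For reference, here is a minimal sketch of the kind of spawner discussed in the quoted messages below. It assumes the target host is passed as the first command-line argument (as in "spawner op2-2") and that a child executable named ./worker exists; both the file name and the overall layout are hypothetical, not taken from anyone's actual code. The "host" Info key is the one reserved by the MPI standard for MPI_Comm_spawn.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    if (argc < 2) {
        fprintf(stderr, "usage: %s <host>\n", argv[0]);
        MPI_Finalize();
        return 1;
    }

    /* Ask the runtime to place the child on the host named in argv[1]. */
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "host", argv[1]);

    /* Spawn one copy of the (hypothetical) ./worker executable.
     * MPI_COMM_SELF makes this process the only parent. */
    MPI_Comm children;
    int errcode;
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 1, info,
                   0, MPI_COMM_SELF, &children, &errcode);
    MPI_Info_free(&info);

    /* With the default error handler a failed spawn aborts before this
     * point; the check only matters if MPI_ERRORS_RETURN has been set. */
    if (errcode != MPI_SUCCESS)
        fprintf(stderr, "spawn on %s failed\n", argv[1]);

    /* The worker is assumed to call MPI_Init and MPI_Finalize itself. */
    MPI_Finalize();
    return 0;
}
```

Launched as "mpirun -n 1 -H op2-1,op2-2 ./spawner op2-2", the runtime already knows about both hosts, which is the workaround described below for 1.2. Running ./spawner directly (as a singleton) exercises the case that is reported broken in 1.2.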

On Jul 30, 2008, at 6:58 AM, Ralph Castain wrote:

> As your own tests have shown, it works fine if you just "mpirun -n
> 1 ./spawner". It is only singleton comm_spawn that appears to be
> having a problem in the latest 1.2 release. So I don't think
> comm_spawn is "useless". ;-)
>
> I'm checking this morning to ensure that singletons properly spawn
> on other nodes in the 1.3 release. I sincerely doubt we will
> backport a fix to 1.2.
>
>
> On Jul 30, 2008, at 6:49 AM, Mark Borgerding wrote:
>
>> I keep checking my email in hopes that someone will come up with
>> something that Matt or I might've missed.
>> I'm just having a hard time accepting that something so fundamental
>> would be so broken.
>> The MPI_Comm_spawn command is essentially useless without the
>> ability to spawn processes on other nodes.
>>
>> If this is true, then my personal scorecard reads:
>> # Days spent using openmpi: 4 (off and on)
>> # identified bugs in openmpi: 2
>> # useful programs built: 0
>>
>> Please prove me wrong. I'm eager to be shown my ignorance -- to
>> find out where I've been stupid and what documentation I should've
>> read.
>>
>>
>> Matt Hughes wrote:
>>> I've found that I always have to use mpirun to start my spawner
>>> process, due to the exact problem you are having: the need to give
>>> OMPI a hosts file! It seems the singleton functionality is lacking
>>> somehow... it won't allow you to spawn on arbitrary hosts. I have not
>>> tested if this is fixed in the 1.3 series.
>>>
>>> Try
>>> mpiexec -np 1 -H op2-1,op2-2 spawner op2-2
>>>
>>> mpiexec should start the first process on op2-1, and the spawn call
>>> should start the second on op2-2. If you don't use the Info object to
>>> set the hostname specifically, then on 1.2.x it will automatically
>>> start on op2-2. With 1.3, the spawn call will start processes
>>> starting with the first item in the host list.
>>>
>>> mch
>>
>> [snip]
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users