Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How to specify hosts for MPI_Comm_spawn
From: Mark Borgerding (markb_at_[hidden])
Date: 2008-07-29 13:57:41


I listed the node names in the file that "ompi_info --param rds hostfile"
reports as rds_hostfile_path -- no luck.
I also tried copying that file to another location and setting
OMPI_MCA_rds_hostfile_path -- no luck.

The remote hosts are named op2-1 and op2-2. Could this be another case
of the problem I saw a few days ago where the hostnames were assumed to
contain a numeric pattern?

-- Mark
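
A side note on the "host" info value discussed further down the thread: it is a single comma-delimited string, so with several machines it may be easier to build it from a list than to hardcode it. The following is only a minimal sketch in plain C. join_hosts is a hypothetical helper, not part of Open MPI or the MPI standard, and the host names in the comment are the two mentioned above. It does not address the allocation error itself, since the hosts still have to be known to the runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * join_hosts (hypothetical helper, not an MPI or Open MPI function):
 * joins n host names with commas into out, which has room for cap bytes
 * including the terminating NUL.
 * Returns 0 on success, -1 if the joined string would not fit.
 */
int join_hosts(const char *const *hosts, size_t n, char *out, size_t cap)
{
    size_t used = 0;

    assert(out != NULL);
    if (cap == 0)
        return -1;
    out[0] = '\0';

    for (size_t i = 0; i < n; ++i) {
        size_t len = strlen(hosts[i]);
        size_t need = len + (i > 0 ? 1 : 0);   /* name plus separator */

        if (used + need + 1 > cap)             /* +1 for the NUL */
            return -1;
        if (i > 0)
            out[used++] = ',';
        memcpy(out + used, hosts[i], len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}

/*
 * Intended use inside an MPI program (not compiled here, needs mpi.h):
 *
 *   const char *hosts[] = { "op2-1", "op2-2" };
 *   char value[256];
 *   if (join_hosts(hosts, 2, value, sizeof value) == 0)
 *       MPI_Info_set(info, "host", value);
 */
```

The fixed-size buffer keeps the sketch short; a real program might size it from the host list instead.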

Ralph Castain wrote:
> For the 1.2 release, I believe you will find the environment
> parameter is OMPI_MCA_rds_hostfile_path - you can check that with
> "ompi_info".
>
>
> On Jul 29, 2008, at 11:10 AM, Mark Borgerding wrote:
>
>> Umm ... what -hostfile file?
>>
>> I am not starting anything via mpiexec/orterun so there is no
>> "-hostfile" argument AFAIK.
>> Is there some other way to communicate this? An environment variable
>> or mca param?
>>
>>
>> -- Mark
>>
>>
>> Ralph Castain wrote:
>>> Are the hosts where you want the children to go in your -hostfile
>>> file? All of the hosts you intend to use have to be in that file,
>>> even if they don't get used until the comm_spawn.
>>>
>>>
>>> On Jul 29, 2008, at 9:08 AM, Mark Borgerding wrote:
>>>
>>>> I've tried lots of different values for the "host" key in the info
>>>> handle.
>>>> I've tried hardcoding the hostname+ip entries in the /etc/hosts
>>>> file -- no luck. I cannot get my MPI_Comm_spawn children to go
>>>> anywhere else on the network.
>>>>
>>>> mpiexec can start groups on the other machines just fine. It seems
>>>> like there is some initialization that is done by orterun but not
>>>> by MPI_Comm_spawn.
>>>>
>>>> Is there a document that describes how the default process
>>>> management works?
>>>> I do not have infiniband, myrinet or any specialized rte, just ssh.
>>>> All the machines are CentOS 5.2 (openmpi 1.2.5)
>>>>
>>>>
>>>> -- Mark
>>>>
>>>> Ralph Castain wrote:
>>>>> The string "localhost" may not be recognized in the 1.2 series for
>>>>> comm_spawn. Do a "hostname" and use that string instead - should
>>>>> work.
>>>>>
>>>>> Ralph
>>>>>
>>>>> On Jul 28, 2008, at 10:38 AM, Mark Borgerding wrote:
>>>>>
>>>>>> When I add the info parameter in MPI_Comm_spawn, I get the error
>>>>>> "Some of the requested hosts are not included in the current
>>>>>> allocation for the application:
>>>>>> [...]
>>>>>> Verify that you have mapped the allocated resources properly
>>>>>> using the
>>>>>> --host specification."
>>>>>>
>>>>>> Here is a snippet of my code that causes the error:
>>>>>>
>>>>>> MPI_Info info;
>>>>>> MPI_Info_create( &info );
>>>>>> MPI_Info_set(info,"host","localhost");
>>>>>> MPI_Comm_spawn( cmd , MPI_ARGV_NULL , nkids , info , 0 ,
>>>>>> MPI_COMM_SELF , &kid , errs );
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Mark Borgerding wrote:
>>>>>>> Thanks, I don't know how I missed that. Perhaps I got thrown off by
>>>>>>> "Portable programs not requiring detailed control over process
>>>>>>> locations should use MPI_INFO_NULL."
>>>>>>>
>>>>>>> If there were a computing equivalent of Maslow's Hierarchy of
>>>>>>> Needs, functioning would be more fundamental than portability :)
>>>>>>>
>>>>>>> -- Mark
>>>>>>>
>>>>>>>
>>>>>>> Ralph Castain wrote:
>>>>>>>> Take a look at the man page for MPI_Comm_spawn. It should
>>>>>>>> explain that you need to create an MPI_Info object with a "host"
>>>>>>>> key whose value is a comma-delimited list of the hosts to be
>>>>>>>> used for the child processes.
>>>>>>>>
>>>>>>>> Hope that helps
>>>>>>>> Ralph
>>>>>>>>
>>>>>>>> On Jul 28, 2008, at 8:54 AM, Mark Borgerding wrote:
>>>>>>>>
>>>>>>>>> How does openmpi decide which hosts are used with
>>>>>>>>> MPI_Comm_spawn? All the docs I've found talk about specifying
>>>>>>>>> hosts on the mpiexec/mpirun command and so are not applicable.
>>>>>>>>> I am unable to spawn on anything but localhost (which makes
>>>>>>>>> for a pretty uninteresting cluster).
>>>>>>>>>
>>>>>>>>> When I run
>>>>>>>>> ompi_info --param rds hostfile
>>>>>>>>> It reports MCA rds: parameter
>>>>>>>>> "rds_hostfile_path" (current value:
>>>>>>>>> "/usr/lib/openmpi/1.2.5-gcc/etc/openmpi-default-hostfile")
>>>>>>>>> I tried changing that file but it has no effect.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I am using
>>>>>>>>> openmpi 1.2.5
>>>>>>>>> CentOS 5.2
>>>>>>>>> ethernet TCP
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -- Mark
>>>>>>>>> _______________________________________________
>>>>>>>>> users mailing list
>>>>>>>>> users_at_[hidden]
>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Mark Borgerding
>>>>>> 3dB Labs, Inc
>>>>>> Innovate. Develop. Deliver.
>>>>>>