Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How to specify hosts for MPI_Comm_spawn
From: Ralph Castain (rhc_at_[hidden])
Date: 2008-07-29 14:11:44


OMPI doesn't care what your hosts are named - many of us use names
that have no numeric pattern or any other discernible pattern to them.

OMPI_MCA_rds_hostfile_path should point to a file that contains a list of
the hosts - have you ensured that it does, and that the hostfile
format is correct? Check the FAQ on the open-mpi.org site:

http://www.open-mpi.org/faq/?category=running#simple-spmd-run

There are several explanations there pertaining to hostfiles.
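
As an illustration that is not part of the original message: the thread below mentions that the "host" info key for MPI_Comm_spawn takes a comma-delimited list of hosts, and that each of those hosts also has to be listed in the hostfile. A minimal sketch of building that value in C might look like the following. The helper name join_hosts and the buffer size in the usage comment are hypothetical, chosen only for this example; the host names op2-1 and op2-2 are the ones mentioned in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of Open MPI): join n host names into
 * "a,b,c", the comma-delimited form described for the "host" info key.
 * Returns 0 on success, or -1 if the result (plus the terminating NUL)
 * would not fit in out; on failure out may hold a partial list. */
int join_hosts(const char *const hosts[], size_t n, char *out, size_t outsz)
{
    size_t used = 0;

    assert(out != NULL && outsz > 0);
    out[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(hosts[i]);
        size_t need = len + (i > 0 ? 1 : 0);   /* +1 for the comma */

        if (used + need + 1 > outsz)           /* +1 for the NUL */
            return -1;
        if (i > 0)
            out[used++] = ',';
        memcpy(out + used, hosts[i], len + 1); /* copies the NUL too */
        used += len;
    }
    return 0;
}

/* Possible use in an MPI program (assumes <mpi.h> and an initialized MPI):
 *
 *     const char *hosts[] = { "op2-1", "op2-2" };
 *     char value[256];
 *     if (join_hosts(hosts, 2, value, sizeof value) == 0)
 *         MPI_Info_set(info, "host", value);
 */
```

The same host names would also need to appear in the file that the hostfile parameter points to; the FAQ entry above covers the hostfile format.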

On Jul 29, 2008, at 11:57 AM, Mark Borgerding wrote:

> I listed the node names in the path named in ompi_info --param rds
> hostfile -- no luck.
> I also tried copying that file to another location and setting
> OMPI_MCA_rds_hostfile_path -- no luck.
>
> The remote hosts are named op2-1 and op2-2. Could this be another
> case of the problem I saw a few days ago where the hostnames were
> assumed to contain a numeric pattern?
>
> -- Mark
>
>
>
> Ralph Castain wrote:
>> For the 1.2 release, I believe you will find the enviro param is
>> OMPI_MCA_rds_hostfile_path - you can check that with "ompi_info".
>>
>>
>> On Jul 29, 2008, at 11:10 AM, Mark Borgerding wrote:
>>
>>> Umm ... what -hostfile file?
>>>
>>> I am not starting anything via mpiexec/orterun so there is no
>>> "-hostfile" argument AFAIK.
>>> Is there some other way to communicate this? An environment
>>> variable or mca param?
>>>
>>>
>>> -- Mark
>>>
>>>
>>> Ralph Castain wrote:
>>>> Are the hosts where you want the children to go in your -hostfile
>>>> file? All of the hosts you intend to use have to be in that file,
>>>> even if they don't get used until the comm_spawn.
>>>>
>>>>
>>>> On Jul 29, 2008, at 9:08 AM, Mark Borgerding wrote:
>>>>
>>>>> I've tried lots of different values for the "host" key in the
>>>>> info handle.
>>>>> I've tried hardcoding the hostname+ip entries in the /etc/hosts
>>>>> file -- no luck. I cannot get my MPI_Comm_spawn children to go
>>>>> anywhere else on the network.
>>>>>
>>>>> mpiexec can start groups on the other machines just fine. It
>>>>> seems like there is some initialization that is done by orterun
>>>>> but not by MPI_Comm_spawn.
>>>>>
>>>>> Is there a document that describes how the default process
>>>>> management works?
>>>>> I do not have infiniband, myrinet or any specialized rte, just
>>>>> ssh.
>>>>> All the machines are CentOS 5.2 (openmpi 1.2.5)
>>>>>
>>>>>
>>>>> -- Mark
>>>>>
>>>>> Ralph Castain wrote:
>>>>>> The string "localhost" may not be recognized in the 1.2 series
>>>>>> for comm_spawn. Do a "hostname" and use that string instead -
>>>>>> should work.
>>>>>>
>>>>>> Ralph
>>>>>>
>>>>>> On Jul 28, 2008, at 10:38 AM, Mark Borgerding wrote:
>>>>>>
>>>>>>> When I add the info parameter in MPI_Comm_spawn, I get the error
>>>>>>> "Some of the requested hosts are not included in the current
>>>>>>> allocation for the application:
>>>>>>> [...]
>>>>>>> Verify that you have mapped the allocated resources properly
>>>>>>> using the
>>>>>>> --host specification."
>>>>>>>
>>>>>>> Here is a snippet of my code that causes the error:
>>>>>>>
>>>>>>> MPI_Info info;
>>>>>>> MPI_Info_create( &info );
>>>>>>> MPI_Info_set(info,"host","localhost");
>>>>>>> MPI_Comm_spawn( cmd , MPI_ARGV_NULL , nkids , info , 0 ,
>>>>>>> MPI_COMM_SELF , &kid , errs );
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Mark Borgerding wrote:
>>>>>>>> Thanks, I don't know how I missed that. Perhaps I got thrown
>>>>>>>> off by
>>>>>>>> "Portable programs not requiring detailed control over
>>>>>>>> process locations should use MPI_INFO_NULL."
>>>>>>>>
>>>>>>>> If there were a computing equivalent of Maslow's Hierarchy of
>>>>>>>> Needs, functioning would be more fundamental than
>>>>>>>> portability :)
>>>>>>>>
>>>>>>>> -- Mark
>>>>>>>>
>>>>>>>>
>>>>>>>> Ralph Castain wrote:
>>>>>>>>> Take a look at the man page for MPI_Comm_spawn. It should
>>>>>>>>> explain that you need to create an MPI_Info key that has the
>>>>>>>>> key of "host" and a value that contains a comma-delimited
>>>>>>>>> list of hosts to be used for the child processes.
>>>>>>>>>
>>>>>>>>> Hope that helps
>>>>>>>>> Ralph
>>>>>>>>>
>>>>>>>>> On Jul 28, 2008, at 8:54 AM, Mark Borgerding wrote:
>>>>>>>>>
>>>>>>>>>> How does openmpi decide which hosts are used with
>>>>>>>>>> MPI_Comm_spawn? All the docs I've found talk about
>>>>>>>>>> specifying hosts on the mpiexec/mpirun command and so are
>>>>>>>>>> not applicable.
>>>>>>>>>> I am unable to spawn on anything but localhost (which makes
>>>>>>>>>> for a pretty uninteresting cluster).
>>>>>>>>>>
>>>>>>>>>> When I run
>>>>>>>>>> ompi_info --param rds hostfile
>>>>>>>>>> It reports MCA rds: parameter
>>>>>>>>>> "rds_hostfile_path" (current value:
>>>>>>>>>> "/usr/lib/openmpi/1.2.5-gcc/etc/openmpi-default-hostfile")
>>>>>>>>>> I tried changing that file but it has no effect.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I am using
>>>>>>>>>> openmpi 1.2.5
>>>>>>>>>> CentOS 5.2
>>>>>>>>>> ethernet TCP
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -- Mark
>>>>>>>>>> _______________________________________________
>>>>>>>>>> users mailing list
>>>>>>>>>> users_at_[hidden]
>>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>> --
>>>>>>> Mark Borgerding
>>>>>>> 3dB Labs, Inc
>>>>>>> Innovate. Develop. Deliver.