Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Torque and OpenMPI 1.2
From: Ralph H Castain (rhc_at_[hidden])
Date: 2007-12-19 16:34:36


Open MPI 1.3 will support use of the hostfile and the tm launcher
simultaneously. It will work slightly differently, though, with respect to
the hostfile:

1. PBS_NODEFILE will be read to obtain a complete list of what has been
allocated to us

2. you will be allowed to provide a hostfile for each app_context as a
separate entry to define the hosts to be used for that specific app_context.
The hosts in your hostfile, however, must be included in the PBS_NODEFILE.

Basically, the hostfile argument will serve as a filter to the hosts
provided via PBS_NODEFILE. We will use the TM launcher (unless, of course,
you tell us to do otherwise), so the issues I mentioned before will go away.
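
The filter relationship described above (every host in the hostfile must also appear in PBS_NODEFILE) can be checked before launch. What follows is only a minimal sketch in POSIX shell, not anything shipped with Open MPI: `check_hostfile` is a hypothetical helper name, and it assumes that the allocation file lists one hostname per line and that each hostfile entry begins with the hostname.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): verify that every host named
# in a user hostfile also appears in the Torque allocation file.
# Usage: check_hostfile ALLOCATION_FILE HOSTFILE
check_hostfile() {
    alloc=$1
    hostfile=$2
    status=0
    # Take the first field of each non-blank, non-comment hostfile line,
    # so an entry such as "node01 slots=4" reduces to "node01".
    for host in $(awk 'NF && $1 !~ /^#/ { print $1 }' "$hostfile" | sort -u); do
        # -q: quiet, -x: match the whole line, -F: treat the name literally
        if ! grep -qxF -- "$host" "$alloc"; then
            echo "not in allocation: $host" >&2
            status=1
        fi
    done
    return $status
}
```

Inside a job script this might be called as `check_hostfile "$PBS_NODEFILE" my_hosts` before passing `-hostfile my_hosts` to mpirun, where `my_hosts` is a placeholder file name.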

There will be a FAQ entry describing the revised hostfile behavior in some
detail. We think the change will help rationalize the behavior so it is more
consistent across all the different use-cases people have invented. ;-)

Hope that helps
Ralph

On 12/19/07 2:05 PM, "pat.o'bryant_at_[hidden]"
<pat.o'bryant_at_[hidden]> wrote:

> Ralph,
> Thanks for the information. I am assuming OpenMPI 1.3 will support the
> "-hostfile" without the extra parms. Will 1.3 also carry the same
> restrictions you list below?
> Pat
>
> J.W. (Pat) O'Bryant,Jr.
> Business Line Infrastructure
> Technical Systems, HPC
> Office: 713-431-7022
>
> From: Ralph H Castain <rhc_at_[hidden]>
> Sent by: users-bounces@open-mpi.org
> Date: 12/19/07 10:10 AM
> To: "Open MPI Users <users_at_[hidden]>"
> cc: Ralph H Castain <rhc_at_[hidden]>
> Subject: Re: [OMPI users] Torque and OpenMPI 1.2
> Please respond to: Open MPI Users <users_at_open-mpi.org>
>
> Just to be clear: what this does is tell Open MPI to launch using the SSH
> launcher. This will work okay, but means that Torque doesn't know about the
> children and cannot monitor them. It also won't work on clusters (such as
> the ones we have here) that do not allow you to ssh procs onto the backend
> nodes.
>
> If you are going this route, you actually don't need the --with-tm configure
> option. Your command line basically tells the system to ignore anything
> associated with tm anyway - you are operating just as if you were in an
> ssh-only cluster.
>
> If it works for you, that is great - just be aware of the limitations and
> disclaimers. I would only suggest it be used as a temporary workaround as
> opposed to a general practice.
>
> Ralph
>
>>
>>> From: "Caird, Andrew J" <acaird_at_[hidden]>
>>> Date: December 19, 2007 9:40:27 AM EST
>>> To: "Open MPI Users" <users_at_[hidden]>
>>> Subject: Re: [OMPI users] Torque and OpenMPI 1.2
>>> Reply-To: Open MPI Users <users_at_[hidden]>
>>>
>>>
>>> Glad to hear that worked for you.
>>>
>>> Full credit goes to Brock Palen who told me about this. It turns
>>> out we also have a user who wanted to do that. And meta-credit goes
>>> to the OMPI developers for making a consistent and flexible set of
>>> MPI tools and libraries.
>>>
>>> --andy
>>>
>>>
>>>> -----Original Message-----
>>>> From: users-bounces_at_[hidden]
>>>> [mailto:users-bounces_at_[hidden]] On Behalf Of
>>>> pat.o'bryant_at_[hidden]
>>>> Sent: Wednesday, December 19, 2007 9:37 AM
>>>> To: Open MPI Users
>>>> Subject: Re: [OMPI users] Torque and OpenMPI 1.2
>>>>
>>>> Andrew,
>>>> That worked like a champ. Now my users can have it both ways. For the
>>>> record, my control statements looked like the following:
>>>>
>>>> /opt/openmpi-1.2.4/bin/mpirun -mca pls ^tm -np $NP -hostfile $PBS_NODEFILE $my_binary_path
>>>>
>>>> My job works just fine and reports no errors. This version of OpenMPI was
>>>> built with "--with-tm=/usr/local/pbs".
>>>>
>>>> Thanks for your help,
>>>> Pat
>>>>
>>>>
>>>> J.W. (Pat) O'Bryant,Jr.
>>>> Business Line Infrastructure
>>>> Technical Systems, HPC
>>>> Office: 713-431-7022
>>>>
>>>> From: "Caird, Andrew J" <acaird_at_umich.edu>
>>>> Sent by: users-bounces@open-mpi.org
>>>> Date: 12/19/07 07:59 AM
>>>> To: "Open MPI Users" <users_at_[hidden]>
>>>> cc: <users-bounces_at_[hidden]>
>>>> Subject: Re: [OMPI users] Torque and OpenMPI 1.2
>>>> Please respond to: Open MPI Users <users_at_open-mpi.org>
>>>>
>>>> oops, I meant -mca, not -mcs
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: users-bounces_at_[hidden]
>>>>> [mailto:users-bounces_at_[hidden]] On Behalf Of Caird, Andrew J
>>>>> Sent: Wednesday, December 19, 2007 8:57 AM
>>>>> To: Open MPI Users
>>>>> Cc: users-bounces_at_[hidden]
>>>>> Subject: Re: [OMPI users] Torque and OpenMPI 1.2
>>>>>
>>>>> Does OMPI built with TM but run with:
>>>>> -mcs pls ^tm
>>>>>
>>>>> give the same effect?
>>>>>
>>>>> --andy
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: users-bounces_at_[hidden]
>>>>>> [mailto:users-bounces_at_[hidden]] On Behalf Of
>>>>>> pat.o'bryant_at_[hidden]
>>>>>> Sent: Wednesday, December 19, 2007 8:47 AM
>>>>>> To: Open MPI Users
>>>>>> Cc: Open MPI Users; users-bounces_at_[hidden]
>>>>>> Subject: Re: [OMPI users] Torque and OpenMPI 1.2
>>>>>>
>>>>>> Terry,
>>>>>> Your suggestion worked. So long as I specifically state "--without-tm",
>>>>>> the OpenMPI 1.2.4 build allows the use of "-hostfile". Apparently, by
>>>>>> default, OpenMPI 1.2.4 will incorporate Torque if it exists, so it is
>>>>>> necessary to specifically request "no Torque support". I used the normal
>>>>>> Torque processes to submit the job and specified "-hostfile $PBS_NODEFILE".
>>>>>> Everything worked.
>>>>>> Thanks for your help,
>>>>>> Pat
>>>>>>
>>>>>> J.W. (Pat) O'Bryant,Jr.
>>>>>> Business Line Infrastructure
>>>>>> Technical Systems, HPC
>>>>>> Office: 713-431-7022
>>>>>> From: Terry Frankcombe <terry_at_[hidden].se>
>>>>>> Sent by: users-bounces@open-mpi.org
>>>>>> Date: 12/18/07 01:45 PM
>>>>>> To: Open MPI Users <users_at_[hidden]>
>>>>>> Subject: Re: [OMPI users] Torque and OpenMPI 1.2
>>>>>> Please respond to: Open MPI Users <users_at_open-mpi.org>
>>>>>>
>>>>>> On Tue, 2007-12-18 at 11:59 -0700, Ralph H Castain wrote:
>>>>>>> Hate to be a party-pooper, but the answer is "no" in OpenMPI 1.2. We don't
>>>>>>> allow the use of a hostfile in a Torque environment in that version.
>>>>>>>
>>>>>>> We have changed this for v1.3, but you'll have to wait for that release.
>>>>>>
>>>>>>
>>>>>> Can one not build OpenMPI without tm support and spawn remote jobs using
>>>>>> the other mechanisms, using only $PBS_NODEFILE (or a derivative of the
>>>>>> file that that points to) in the script?
>>>>>>
>>>>>> Ciao
>>>>>> Terry
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Dr Terry Frankcombe
>>>>>> Physical Chemistry, Department of Chemistry
>>>>>> Göteborgs Universitet
>>>>>> SE-412 96 Göteborg Sweden
>>>>>> Ph: +46 76 224 0887 Skype: terry.frankcombe
>>>>>> <terry_at_[hidden]>
>>>>>>
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> users_at_[hidden]
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users