Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] why does --rankfile need hostlist?
From: Terry Dontje (Terry.Dontje_at_[hidden])
Date: 2009-06-22 06:30:07


Let us think about this some more. We'll try and reply later today.

--td

Ralph Castain wrote:
> Had a chance to think about how this might be done, and looked at it
> for a while after getting home. I -think- I found a way to do it, but
> there are a couple of caveats:
>
> 1. Len's point about oversubscribing without warning would definitely
> hold true - this would positively be a "user beware" option.
>
> 2. There could be no RM-provided allocation, hostfile, or -host
> options specified. Basically, I would be adding the "read rankfile"
> option to the end of the current allocation determination procedure.
>
> I would still allow more procs than shown in the rankfile (mapping the
> rest bynode on the nodes specified in the rankfile - can't do byslot
> because I don't know how many slots are on each node), which means the
> only change in behavior would be the forced bynode mapping of
> unspecified procs.
>
> So use of this option will entail some risks and a slight difference
> in behavior, but would relieve you from the burden of having to
> provide a hostfile. I'm not personally convinced it is worth the risk
> and probable user complaints of "it didn't work", but since we don't
> use this option, I don't have a strong opinion on the matter.
>
> Let's just avoid going back-and-forth over wanting it, or how it
> should be implemented - let's get it all ironed out, and then
> implement it once, like we finally did at the end with the whole
> hostfile thing.
>
> Let me know if you want me to do this - it obviously isn't at the top
> of my priority list, but still could be done in the next few weeks.
>
> Ralph
>
>
> On Jun 21, 2009, at 9:00 AM, Lenny Verkhovsky wrote:
>
>> Sorry for the delay in responding.
>> I totally agree with Ralph that it's not as easy as it seems:
>> 1. The rankfile mapper uses machines that are already allocated (by the
>> scheduler or a hostfile). If we use the rankfile as a hostfile, we can
>> end up trying to use unallocated nodes, which can hang the run.
>> 2. We can't define the number of slots on each machine in a rankfile,
>> which means oversubscription can take place without any warning.
>> 3. I personally don't see any problem with using a hostfile, even if it
>> has redundant info; the hostfile and rankfile belong to different layers
>> of the system and solve different problems. The original hostfile (if I
>> recall correctly) could bind a rank to a node, but the syntax wasn't
>> very flexible or clear.
>> Lenny.
>>
>> On Sun, Jun 21, 2009 at 5:15 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>
>> Let me suggest a two-step process, then:
>>
>> 1. let's change the error message as this is easily done and thus
>> can be done now
>>
>> 2. I can look at how to eat the rankfile as a hostfile. This may
>> not even be possible - the problem is that the entire system is
>> predicated on certain ordering due to our framework architecture.
>> So we get an allocation, and then do a mapping against that
>> allocation, filtering the allocation through hostfiles, -host,
>> and other options.
>>
>> By the time we reach the rankfile mapper, we have already
>> determined that we don't have an allocation and have to abort. It
>> is the rankfile mapper itself that looks for the -rankfile
>> option, so the system can have no knowledge that someone has
>> specified that option before that point - and thus, even if I
>> could parse the rankfile, I don't know it was given!
>>
>> What will take time is to figure out a way to either:
>>
>> (a) allow us to run the mapper even though we don't have any
>> nodes we know about, and allow the mapper to insert the nodes
>> itself - without causing non-rankfile uses to break (which could
>> be a major feat); or
>>
>> (b) have the overall system check for the rankfile option and
>> pass it as a hostfile as well, assuming that a hostfile wasn't
>> also given, no RM-based allocation exists, etc. - which breaks
>> our abstraction rules and also opens a possible can of worms.
>>
>> Either way, I also then have to teach the hostfile parser how to
>> realize it is a rankfile format and convert the info in it into
>> what we expected to receive from a hostfile - another non-trivial
>> problem.
>>
>> I'm willing to give it a try - just trying to make clear why my
>> response was negative. It isn't as simple as it sounds...which is
>> why Len and I didn't pursue it when this was originally developed.
>>
>> Ralph
>>
>>
>> On Sun, Jun 21, 2009 at 5:28 AM, Terry Dontje <Terry.Dontje_at_[hidden]> wrote:
>>
>> Having been a part of these discussions, I can understand your
>> reticence to reopen this one. However, I think this is a major
>> usability issue with a feature that is fairly important for
>> getting things to run with good performance, which IMO matters.
>>
>> That being said, I think one of two things could be done to
>> mitigate the issue:
>>
>> 1. Eliminate the element of surprise by changing mpirun to eat
>> the rankfile without the hostfile.
>> 2. Change the error message to something understandable by the
>> user, so that they know they might be missing the hostfile option.
>>
>> Again, I understand this topic is frustrating and that there are
>> some boundaries in the design that make these two options
>> orthogonal to each other, but I really believe we need to make
>> the rankfile option something that is easily usable by our users.
>>
>>
>> --td
>>
>> Ralph Castain wrote:
>>
>> Having gone around in circles on hostfile-related issues
>> for over five years now, I honestly have little
>> motivation to re-open the entire discussion again. It
>> doesn't seem to be that daunting a requirement for those
>> who are using it, so I'm inclined to just leave well
>> enough alone.
>>
>> :-)
>>
>>
>> On Fri, Jun 19, 2009 at 2:21 PM, Eugene Loh <Eugene.Loh_at_[hidden]> wrote:
>>
>> Ralph Castain wrote:
>>
>> The two files have a slightly different format
>>
>> Agreed.
>>
>> and completely different meaning.
>>
>> Somewhat agreed. They're both related to mapping processes onto a
>> cluster.
>>
>> The hostfile specifies how many slots are on a node. The rankfile
>> specifies a rank and what node/slot it is to be mapped onto.
>>
>> Agreed.
>>
>> Rankfiles can use relative node indexing and refer to nodes
>> received from a resource manager - i.e., without any hostfile.
>>
>> This is the main part I'm concerned about. E.g.,
>>
>> % cat rankfile
>> rank 0=node0 slot=0
>> rank 1=node1 slot=0
>> % mpirun -np 2 -rf rankfile ./a.out
>>
>> --------------------------------------------------------------------------
>> Rankfile claimed host node1 that was not allocated or oversubscribed it's slots:
>>
>>
>> --------------------------------------------------------------------------
>> [node0:14611] [[61560,0],0] ORTE_ERROR_LOG: Bad parameter in file rmaps_rank_file.c at line 107
>> [node0:14611] [[61560,0],0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 86
>> [node0:14611] [[61560,0],0] ORTE_ERROR_LOG: Bad parameter in file base/plm_base_launch_support.c at line 86
>> [node0:14611] [[61560,0],0] ORTE_ERROR_LOG: Bad parameter in file plm_rsh_module.c at line 1016
>> % mpirun -np 2 -host node0,node1 -rf rankfile ./a.out
>> 0 on node0
>> 1 on node1
>> done
>>
>> It seems to me that the rankfile has sufficient information to
>> express what I want it to do. But mpirun won't accept this. To
>> fix this, I have to, e.g., supply/maintain/specify redundant
>> information in a hostfile or host list.
>>
>> So the files are intentionally quite different. Trying to combine
>> them would be rather ugly.
>>
>> Right. And my issue is that I'm forced to use both when I only
>> want rankfile functionality.
>>
>> On Thu, Jun 18, 2009 at 1:52 PM, Eugene Loh <Eugene.Loh_at_[hidden]> wrote:
>>
>> In order to use "mpirun --rankfile", I also need to specify
>> hosts/hostlist. But that information is redundant with what
>> I provide in the rankfile. So, from a user's point of view,
>> this strikes me as broken. Yes? Should I file a ticket, or
>> am I missing something here about this functionality?
>>
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>>
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
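
The workaround shown in the thread (passing -host alongside -rf) repeats host names that the rankfile already lists. A minimal Python sketch of deriving that host list, and of the bynode placement Ralph describes for ranks the rankfile leaves out, might look like the following. The helper names (parse_rankfile, hosts_in_order, map_extra_bynode) are hypothetical and not part of Open MPI, and the parser assumes only the "rank N=host slot=S" line form that appears in Eugene's example.

```python
import re

# Assumed line form, taken from the example in the thread: "rank 0=node0 slot=0".
RANK_LINE = re.compile(r"^rank\s+(\d+)\s*=\s*(\S+)\s+slot\s*=\s*(\S+)$")


def parse_rankfile(text):
    """Return {rank: (host, slot)}; the slot is kept as an uninterpreted string."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        match = RANK_LINE.match(line)
        if match is None:
            raise ValueError(f"unrecognized rankfile line: {raw!r}")
        mapping[int(match.group(1))] = (match.group(2), match.group(3))
    return mapping


def hosts_in_order(mapping):
    """Unique hosts, ordered by the lowest rank that names each one."""
    hosts = []
    for rank in sorted(mapping):
        host = mapping[rank][0]
        if host not in hosts:
            hosts.append(host)
    return hosts


def map_extra_bynode(mapping, nprocs):
    """Place ranks absent from the rankfile round-robin over its hosts."""
    hosts = hosts_in_order(mapping)
    if not hosts:
        raise ValueError("rankfile names no hosts")
    extra = [rank for rank in range(nprocs) if rank not in mapping]
    return {rank: hosts[i % len(hosts)] for i, rank in enumerate(extra)}


rankfile = "rank 0=node0 slot=0\nrank 1=node1 slot=0\n"
ranks = parse_rankfile(rankfile)
host_arg = ",".join(hosts_in_order(ranks))
print("mpirun -np 2 -host", host_arg, "-rf rankfile ./a.out")
```

For the two-line rankfile from the thread, this prints the same command line Eugene ran successfully. The sketch cannot detect oversubscription, because the rankfile carries no slot counts per node, which is the limitation Lenny raises in his second point.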