Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] -display-map
From: Greg Watson (g.watson_at_[hidden])
Date: 2009-01-16 12:58:17


When I try to build trunk, it fails with:

i_f77.lax/libmpi_f77_pmpi.a/pwin_unlock_f.o .libs/libmpi_f77.lax/
libmpi_f77_pmpi.a/pwin_wait_f.o .libs/libmpi_f77.lax/libmpi_f77_pmpi.a/
pwtick_f.o .libs/libmpi_f77.lax/libmpi_f77_pmpi.a/
pwtime_f.o ../../../ompi/.libs/libmpi.0.0.0.dylib /usr/local/
openmpi-1.4-devel/lib/libopen-rte.0.0.0.dylib /usr/local/openmpi-1.4-
devel/lib/libopen-pal.0.0.0.dylib -install_name /usr/local/
openmpi-1.4-devel/lib/libmpi_f77.0.dylib -compatibility_version 1 -
current_version 1.0
ld: duplicate symbol _mpi_reduce_local_f in .libs/libmpi_f77.lax/
libmpi_f77_pmpi.a/preduce_local_f.o and .libs/reduce_local_f.o

collect2: ld returned 1 exit status
make[3]: *** [libmpi_f77.la] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1

I'm using the default configure command (./configure --prefix=xxx) on
Mac OS X 10.5. This works fine on the 1.3 branch.

Greg

On Jan 15, 2009, at 1:13 PM, Ralph Castain wrote:

> Okay, it is in the trunk as of r20284 - I'll file the request to
> have it moved to 1.3.1.
>
> Let me know if you get a chance to test the stdout/err stuff in the
> trunk - we should try and iterate it so any changes can make 1.3.1
> as well.
>
> Thanks!
> Ralph
>
>
> On Jan 15, 2009, at 11:03 AM, Greg Watson wrote:
>
>> Ralph,
>>
>> I think the second form would be ideal and would simplify things
>> greatly.
>>
>> Greg
>>
>> On Jan 15, 2009, at 10:53 AM, Ralph Castain wrote:
>>
>>> Here is what I was able to do - note that the resolve messages are
>>> associated with the specific hostname, not the overall map:
>>>
>>> <map>
>>> <host name="graywolf54.lanl.gov" slots="1" max_slots="0">
>>> <noderesolve name="graywolf54.lanl.gov" resolved="localhost"/>
>>> <process rank="0"/>
>>> <process rank="1"/>
>>> <process rank="2"/>
>>> </host>
>>> </map>
>>>
>>> Will that work for you? If you like, I can remove the name= field
>>> from the noderesolve element since the info is specific to the
>>> host element that contains it. In other words, I can make it look
>>> like this:
>>>
>>> <map>
>>> <host name="graywolf54.lanl.gov" slots="1" max_slots="0">
>>> <noderesolve resolved="localhost"/>
>>> <process rank="0"/>
>>> <process rank="1"/>
>>> <process rank="2"/>
>>> </host>
>>> </map>
>>>
>>> if that would help.
>>>
>>> Ralph
>>>
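As an illustration of how a client might consume the second form above, here is a minimal Python sketch that reads the per-host noderesolve and process elements. The function name read_map and the dictionary layout are made up for this example; only the element and attribute names come from the XML quoted above.

```python
import xml.etree.ElementTree as ET

# Sample taken from the second form proposed above.
MAP_XML = """\
<map>
<host name="graywolf54.lanl.gov" slots="1" max_slots="0">
<noderesolve resolved="localhost"/>
<process rank="0"/>
<process rank="1"/>
<process rank="2"/>
</host>
</map>
"""

def read_map(text):
    """Return {host name: {"resolved": [...], "ranks": [...]}} for a <map> document."""
    hosts = {}
    for host in ET.fromstring(text).findall("host"):
        hosts[host.get("name")] = {
            # noderesolve entries nested inside this host element
            "resolved": [n.get("resolved") for n in host.findall("noderesolve")],
            # ranks of the processes mapped onto this host
            "ranks": [int(p.get("rank")) for p in host.findall("process")],
        }
    return hosts

layout = read_map(MAP_XML)
```

Because the noderesolve element sits inside its host element, no name= attribute is needed to tie the two together.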
>>>
>>> On Jan 14, 2009, at 7:57 AM, Ralph Castain wrote:
>>>
>>>> We -may- be able to do a more formal XML output at some point.
>>>> The problem will be the natural interleaving of stdout/err from
>>>> the various procs due to the async behavior of MPI. Mpirun
>>>> receives fragmented output in the forwarding system, limited by
>>>> the buffer sizes and the amount of data we can read at any one
>>>> "bite" from the pipes connecting us to the procs. So even though
>>>> the user -thinks- they output a single large line of stuff, it
>>>> may show up at mpirun as a series of fragments. Hence, it gets
>>>> tricky to know how to put appropriate XML brackets around it.
>>>>
>>>> Given this input about when you actually want resolved name info,
>>>> I can at least do something about that area. Won't be in 1.3.0,
>>>> but should make 1.3.1.
>>>>
>>>> As for XML-tagged stdout/err: the OMPI community asked me not to
>>>> turn that feature "on" for 1.3.0 as they felt it hasn't been
>>>> adequately tested yet. The code is present, but cannot be
>>>> activated in 1.3.0. However, I believe it is activated on the
>>>> trunk when you do --xml --tagged-output, so perhaps some testing
>>>> will help us debug and validate it adequately for 1.3.1?
>>>>
>>>> Thanks
>>>> Ralph
>>>>
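The fragmentation problem described above can be sketched as follows: buffer each process's output until a newline arrives, and only then wrap the completed line in a tag. This is a hypothetical illustration in Python, not how mpirun does it; the LineTagger class and the <stdout rank="..."> element are assumptions invented for the example.

```python
from collections import defaultdict
from xml.sax.saxutils import escape

class LineTagger:
    """Buffer per-rank output fragments and emit one tagged element per full line."""

    def __init__(self):
        # rank -> text received so far that has not been ended by a newline
        self._pending = defaultdict(str)

    def feed(self, rank, fragment):
        """Add a fragment; return the tagged lines it completes (possibly none)."""
        data = self._pending[rank] + fragment
        *lines, rest = data.split("\n")
        self._pending[rank] = rest
        return ['<stdout rank="%d">%s</stdout>' % (rank, escape(line)) for line in lines]

    def flush(self):
        """Emit leftover text that never ended in a newline."""
        out = ['<stdout rank="%d">%s</stdout>' % (rank, escape(text))
               for rank, text in sorted(self._pending.items()) if text]
        self._pending.clear()
        return out

tagger = LineTagger()
emitted = []
# Fragments arrive interleaved and split at arbitrary points.
for rank, frag in [(0, "hello "), (1, "a<b\n"), (0, "world\npar"), (0, "tial")]:
    emitted += tagger.feed(rank, frag)
emitted += tagger.flush()
```

The trade-off is latency: a line is held back until its newline shows up, so a process that prints a prompt without a newline produces nothing until flush.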
>>>>
>>>> On Jan 14, 2009, at 7:02 AM, Greg Watson wrote:
>>>>
>>>>> Ralph,
>>>>>
>>>>> The only time we use the resolved names is when we get a map, so
>>>>> we consider them part of the map output.
>>>>>
>>>>> If quasi-XML is all that will ever be possible with 1.3, then
>>>>> you may as well leave as-is and we will attempt to clean it up
>>>>> in Eclipse. It would be nice if a future version of ompi could
>>>>> output correct XML (including stdout) as this would vastly
>>>>> simplify the parsing we need to do.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Greg
>>>>>
>>>>> On Jan 13, 2009, at 3:30 PM, Ralph Castain wrote:
>>>>>
>>>>>> Hmmm...well, I can't do either for 1.3.0 as it is departing
>>>>>> this afternoon.
>>>>>>
>>>>>> The first option would be very hard to do. I would have to
>>>>>> expose the display-map option across the code base and check it
>>>>>> prior to printing anything about resolving node names. I guess
>>>>>> I should ask: do you only want noderesolve statements when we
>>>>>> are displaying the map? Right now, I will output them regardless.
>>>>>>
>>>>>> The second option could be done. I could check if any "display"
>>>>>> option has been specified, and output the <ompi> root at that
>>>>>> time (likewise for the end). Anything we output in-between
>>>>>> would be encapsulated between the two, but that would include
>>>>>> any user output to stdout and/or stderr - which for 1.3.0 is
>>>>>> not in xml.
>>>>>>
>>>>>> Any thoughts?
>>>>>>
>>>>>> Ralph
>>>>>>
>>>>>> PS. Guess I should clarify that I was not striving for true XML
>>>>>> interaction here, but rather a quasi-XML format that would help
>>>>>> you to filter the output. I have no problem trying to get to
>>>>>> something more formally correct, but it could be tricky in some
>>>>>> places to achieve it due to the inherent async nature of the
>>>>>> beast.
>>>>>>
>>>>>>
>>>>>> On Jan 13, 2009, at 12:17 PM, Greg Watson wrote:
>>>>>>
>>>>>>> Ralph,
>>>>>>>
>>>>>>> The XML is looking better now, but there is still one problem.
>>>>>>> To be valid, there needs to be only one root element, but
>>>>>>> currently you don't have any (or many). So rather than:
>>>>>>>
>>>>>>> <noderesolve name="node0" resolved="Jarrah.local"/>
>>>>>>> <noderesolve name="node1" resolved="Jarrah.local"/>
>>>>>>> <map>
>>>>>>> <host name="Jarrah.local" slots="8" max_slots="0">
>>>>>>> <process rank="0"/>
>>>>>>> <process rank="1"/>
>>>>>>> <process rank="2"/>
>>>>>>> <process rank="3"/>
>>>>>>> <process rank="4"/>
>>>>>>> </host>
>>>>>>> </map>
>>>>>>>
>>>>>>> the XML should be:
>>>>>>>
>>>>>>> <map>
>>>>>>> <noderesolve name="node0" resolved="Jarrah.local"/>
>>>>>>> <noderesolve name="node1" resolved="Jarrah.local"/>
>>>>>>> <host name="Jarrah.local" slots="8" max_slots="0">
>>>>>>> <process rank="0"/>
>>>>>>> <process rank="1"/>
>>>>>>> <process rank="2"/>
>>>>>>> <process rank="3"/>
>>>>>>> <process rank="4"/>
>>>>>>> </host>
>>>>>>> </map>
>>>>>>>
>>>>>>> or:
>>>>>>>
>>>>>>> <ompi>
>>>>>>> <noderesolve name="node0" resolved="Jarrah.local"/>
>>>>>>> <noderesolve name="node1" resolved="Jarrah.local"/>
>>>>>>> <map>
>>>>>>> <host name="Jarrah.local" slots="8" max_slots="0">
>>>>>>> <process rank="0"/>
>>>>>>> <process rank="1"/>
>>>>>>> <process rank="2"/>
>>>>>>> <process rank="3"/>
>>>>>>> <process rank="4"/>
>>>>>>> </host>
>>>>>>> </map>
>>>>>>> </ompi>
>>>>>>>
>>>>>>> Would either of these be possible?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Greg
>>>>>>>
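The single-root requirement can be checked with any XML parser. A small Python sketch, using a shortened copy of the output above: the bare output is rejected, while the same text wrapped in an <ompi> root parses. The helper names parses and wrap_in_root are made up for this example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the output quoted above: three sibling top-level elements.
FRAGMENT = """\
<noderesolve name="node0" resolved="Jarrah.local"/>
<noderesolve name="node1" resolved="Jarrah.local"/>
<map>
<host name="Jarrah.local" slots="8" max_slots="0">
<process rank="0"/>
<process rank="1"/>
</host>
</map>
"""

def parses(text):
    """Return True if text is a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

def wrap_in_root(text, root="ompi"):
    """Enclose sibling top-level elements in a single root element."""
    return "<{0}>\n{1}</{0}>\n".format(root, text)

bare_ok = parses(FRAGMENT)                    # False: more than one top-level element
wrapped_ok = parses(wrap_in_root(FRAGMENT))   # True: exactly one root
```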
>>>>>>> On Dec 8, 2008, at 2:18 PM, Greg Watson wrote:
>>>>>>>
>>>>>>>> Ok thanks. I'll test from trunk in future.
>>>>>>>>
>>>>>>>> Greg
>>>>>>>>
>>>>>>>> On Dec 8, 2008, at 2:05 PM, Ralph Castain wrote:
>>>>>>>>
>>>>>>>>> Working its way around the CMR process now.
>>>>>>>>>
>>>>>>>>> Might be easier in the future if we could test/debug this in
>>>>>>>>> the trunk, though. Otherwise, the CMR procedure will fall
>>>>>>>>> behind and a fix might miss a release window.
>>>>>>>>>
>>>>>>>>> Anyway, hopefully this one will make the 1.3.0 release cutoff.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Ralph
>>>>>>>>>
>>>>>>>>> On Dec 8, 2008, at 9:56 AM, Greg Watson wrote:
>>>>>>>>>
>>>>>>>>>> Hi Ralph,
>>>>>>>>>>
>>>>>>>>>> This is now in 1.3rc2, thanks. However there are a couple
>>>>>>>>>> of problems. Here is what I see:
>>>>>>>>>>
>>>>>>>>>> [Jarrah.watson.ibm.com:58957] <noderesolve name="node0"
>>>>>>>>>> resolved="Jarrah.watson.ibm.com">
>>>>>>>>>>
>>>>>>>>>> For some reason each line is prefixed with "[...]", any
>>>>>>>>>> idea why this is? Also the end tag should be "/>" not ">".
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Greg
>>>>>>>>>>
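Until both problems are fixed at the source, a consumer could work around them on its side. The following Python sketch strips a leading "[host:pid] " prefix and closes a noderesolve tag that ends in ">" rather than "/>". The regular expressions are assumptions based only on the single line quoted above and may not cover other output.

```python
import re

# Leading "[host:pid] " prefix, as in the 1.3rc2 line quoted above.
PREFIX = re.compile(r"^\[[^\]]*:\d+\]\s*")
# A noderesolve tag whose last character before ">" is not "/".
UNCLOSED = re.compile(r"(<noderesolve\b[^>]*[^/>])>")

def clean(line):
    """Remove the bracketed prefix and make the noderesolve tag self-closing."""
    line = PREFIX.sub("", line)
    return UNCLOSED.sub(r"\1/>", line)

raw = ('[Jarrah.watson.ibm.com:58957] '
       '<noderesolve name="node0" resolved="Jarrah.watson.ibm.com">')
fixed = clean(raw)
already_ok = clean('<noderesolve name="node0" resolved="Jarrah.local"/>')
```

A tag that is already self-closing passes through unchanged, so the same filter can stay in place after the fix lands.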
>>>>>>>>>> On Nov 24, 2008, at 3:06 PM, Greg Watson wrote:
>>>>>>>>>>
>>>>>>>>>>> Great, thanks. I'll take a look once it comes over to 1.3.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>>
>>>>>>>>>>> Greg
>>>>>>>>>>>
>>>>>>>>>>> On Nov 24, 2008, at 2:59 PM, Ralph Castain wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Yo Greg
>>>>>>>>>>>>
>>>>>>>>>>>> This is in the trunk as of r20032. I'll bring it over to
>>>>>>>>>>>> 1.3 in a few days.
>>>>>>>>>>>>
>>>>>>>>>>>> I implemented it as another MCA param
>>>>>>>>>>>> "orte_show_resolved_nodenames" so you can actually get
>>>>>>>>>>>> the info as you execute the job, if you want. The xml tag
>>>>>>>>>>>> is "noderesolve" - let me know if you need any changes.
>>>>>>>>>>>>
>>>>>>>>>>>> Ralph
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Oct 22, 2008, at 11:55 AM, Greg Watson wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Ralph,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I guess the issue for us is that we will have to run two
>>>>>>>>>>>>> commands to get the information we need. One to get the
>>>>>>>>>>>>> configuration information, such as version and MCA
>>>>>>>>>>>>> parameters, and one to get the host information, whereas
>>>>>>>>>>>>> it would seem more logical that this should all be
>>>>>>>>>>>>> available via some kind of "configuration discovery"
>>>>>>>>>>>>> command. I understand the issue with supplying the
>>>>>>>>>>>>> hostfile though, so maybe this just points at the need
>>>>>>>>>>>>> for us to separate configuration information from the
>>>>>>>>>>>>> host information. In any case, we'll work with what you
>>>>>>>>>>>>> think is best.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Greg
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Oct 20, 2008, at 4:49 PM, Ralph Castain wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hmmm...just to be sure we are all clear on this. The
>>>>>>>>>>>>>> reason we proposed to use mpirun is that "hostfile" has
>>>>>>>>>>>>>> no meaning outside of mpirun. That's why ompi_info
>>>>>>>>>>>>>> can't do anything in this regard.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We have no idea what hostfile the user may specify
>>>>>>>>>>>>>> until we actually get the mpirun cmd line. They may
>>>>>>>>>>>>>> have specified a default-hostfile, but they could also
>>>>>>>>>>>>>> specify hostfiles for the individual app_contexts.
>>>>>>>>>>>>>> These may or may not include the node upon which mpirun
>>>>>>>>>>>>>> is executing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So the only way to provide you with a separate command
>>>>>>>>>>>>>> to get a hostfile<->nodename mapping would require you
>>>>>>>>>>>>>> to provide us with the default-hostfile and/or hostfile
>>>>>>>>>>>>>> cmd line options just as if you were issuing the mpirun
>>>>>>>>>>>>>> cmd. We just wouldn't launch - but it would be the
>>>>>>>>>>>>>> exact equivalent of doing "mpirun --do-not-launch".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Am I missing something? If so, please do correct me - I
>>>>>>>>>>>>>> would be happy to provide a tool if that would make it
>>>>>>>>>>>>>> easier. Just not sure what that tool would do.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>> Ralph
>>>>>>>>>>>>>>
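A tool on the client side could assemble such a "map only" invocation itself. The Python sketch below only builds the argument list and does not run anything. It assumes the options named in this thread (--xml, -display-map, --do-not-launch, hostfile and default-hostfile) can be combined this way; the function name and the sample file names are hypothetical.

```python
import shlex

def map_only_command(app, hostfile=None, default_hostfile=None):
    """Build an mpirun command line that asks for the XML map without launching."""
    argv = ["mpirun", "--xml", "--display-map", "--do-not-launch"]
    if default_hostfile:
        argv += ["--default-hostfile", default_hostfile]
    if hostfile:
        argv += ["--hostfile", hostfile]
    argv.append(app)  # mpirun still expects an application on the command line
    return argv

cmd = map_only_command("./a.out", hostfile="my_hosts")
printable = " ".join(shlex.quote(arg) for arg in cmd)
```

Whether the map is actually printed when nothing is launched may depend on the Open MPI version, so this should be checked against the release in use.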
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Oct 19, 2008, at 1:59 PM, Greg Watson wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ralph,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It seems a little strange to be using mpirun for this,
>>>>>>>>>>>>>>> but barring providing a separate command, or using
>>>>>>>>>>>>>>> ompi_info, I think this would solve our problem.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Greg
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Oct 17, 2008, at 10:46 AM, Ralph Castain wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sorry for the delay - had to ponder this one for a while.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Jeff and I agree that adding something to ompi_info
>>>>>>>>>>>>>>>> would not be a good idea. Ompi_info has no knowledge
>>>>>>>>>>>>>>>> or understanding of hostfiles, and adding that
>>>>>>>>>>>>>>>> capability to it would be a major distortion of its
>>>>>>>>>>>>>>>> intended use.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> However, we think we can offer an alternative that
>>>>>>>>>>>>>>>> might better solve the problem. Remember, we now
>>>>>>>>>>>>>>>> treat hostfiles in a very different manner than
>>>>>>>>>>>>>>>> before - see the wiki page for a complete
>>>>>>>>>>>>>>>> description, or "man orte_hosts".
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So the problem is that, to provide you with what you
>>>>>>>>>>>>>>>> want, we need to "dump" the information from whatever
>>>>>>>>>>>>>>>> default-hostfile was provided, and, if no default-
>>>>>>>>>>>>>>>> hostfile was provided, then the information from each
>>>>>>>>>>>>>>>> hostfile that was provided with an app_context.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The best way we could think of to do this is to add
>>>>>>>>>>>>>>>> another mpirun cmd line option --dump-hostfiles that
>>>>>>>>>>>>>>>> would output the line-by-line name from the hostfile
>>>>>>>>>>>>>>>> plus the name we resolved it to. Of course, --xml
>>>>>>>>>>>>>>>> would cause it to be in xml format.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Would that meet your needs?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Ralph
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Oct 15, 2008, at 3:12 PM, Greg Watson wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Ralph,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We've been discussing this back and forth a bit
>>>>>>>>>>>>>>>>> internally and don't really see an easy solution.
>>>>>>>>>>>>>>>>> Our problem is that Eclipse is not running on the
>>>>>>>>>>>>>>>>> head node, so gethostbyname will not necessarily
>>>>>>>>>>>>>>>>> resolve to the same address. For example, the
>>>>>>>>>>>>>>>>> hostfile might refer to the head node by an internal
>>>>>>>>>>>>>>>>> network address that is not visible to the outside
>>>>>>>>>>>>>>>>> world. Since gethostbyname also looks in /etc/hosts,
>>>>>>>>>>>>>>>>> it may resolve locally but not on a remote system.
>>>>>>>>>>>>>>>>> The only thing I can think of would be, rather than
>>>>>>>>>>>>>>>>> us reading the hostfile directly as we do now, to
>>>>>>>>>>>>>>>>> provide an option to ompi_info that would dump the
>>>>>>>>>>>>>>>>> hostfile using the same rules that you apply when
>>>>>>>>>>>>>>>>> you're using the hostfile. Would that be feasible?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Greg
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sep 22, 2008, at 4:25 PM, Ralph Castain wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Sorry for delay - was on vacation and am now trying
>>>>>>>>>>>>>>>>>> to work my way back to the surface.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm not sure I can fix this one for two reasons:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 1. In general, OMPI doesn't really care what name
>>>>>>>>>>>>>>>>>> is used for the node. However, the problem is that
>>>>>>>>>>>>>>>>>> it needs to be consistent. In this case, ORTE has
>>>>>>>>>>>>>>>>>> already used the name returned by gethostname to
>>>>>>>>>>>>>>>>>> create its session directory structure long before
>>>>>>>>>>>>>>>>>> mpirun reads a hostfile. This is why we retain the
>>>>>>>>>>>>>>>>>> value from gethostname instead of allowing it to be
>>>>>>>>>>>>>>>>>> overwritten by the name in whatever allocation we
>>>>>>>>>>>>>>>>>> are given. Using the name in hostfile would require
>>>>>>>>>>>>>>>>>> that I either find some way to remember any prior
>>>>>>>>>>>>>>>>>> name, or that I tear down and rebuild the session
>>>>>>>>>>>>>>>>>> directory tree - neither seems attractive nor
>>>>>>>>>>>>>>>>>> simple (e.g., what happens when the user provides
>>>>>>>>>>>>>>>>>> multiple entries in the hostfile for the node, each
>>>>>>>>>>>>>>>>>> with a different IP address based on another
>>>>>>>>>>>>>>>>>> interface in that node? Sounds crazy, but we have
>>>>>>>>>>>>>>>>>> already seen it done - which one do I use?).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2. We don't actually store the hostfile info
>>>>>>>>>>>>>>>>>> anywhere - we just use it and forget it. For us to
>>>>>>>>>>>>>>>>>> add an XML attribute containing any hostfile-
>>>>>>>>>>>>>>>>>> related info would therefore require us to re-read
>>>>>>>>>>>>>>>>>> the hostfile. I could have it do that -only- in the
>>>>>>>>>>>>>>>>>> case of "XML output required", but it seems rather
>>>>>>>>>>>>>>>>>> ugly.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> An alternative might be for you to simply do a
>>>>>>>>>>>>>>>>>> "gethostbyname" lookup of the IP address or
>>>>>>>>>>>>>>>>>> hostname to see if it matches instead of just doing
>>>>>>>>>>>>>>>>>> a strcmp. This is what we have to do internally as
>>>>>>>>>>>>>>>>>> we frequently have problems with FQDN vs. non-FQDN
>>>>>>>>>>>>>>>>>> vs. IP addresses etc. If the local OS hasn't cached
>>>>>>>>>>>>>>>>>> the IP address for the node in question it can take
>>>>>>>>>>>>>>>>>> a little time to DNS resolve it, but otherwise
>>>>>>>>>>>>>>>>>> works fine.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I can point you to the code in OPAL that we use - I
>>>>>>>>>>>>>>>>>> would think something similar would be easy to
>>>>>>>>>>>>>>>>>> implement in your code and would readily solve the
>>>>>>>>>>>>>>>>>> problem.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Ralph
>>>>>>>>>>>>>>>>>>
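The lookup-instead-of-strcmp idea above can be sketched in Python. This is not the OPAL code mentioned in the message; it uses socket.getaddrinfo rather than gethostbyname, and treats two names as the same node when their resolved address sets overlap. The fake resolver and its address table are invented so the example runs without DNS.

```python
import socket

def addresses(name, resolver=socket.getaddrinfo):
    """Return the set of IP addresses a host name or literal address resolves to."""
    try:
        infos = resolver(name, None)
    except socket.gaierror:
        return set()  # unresolvable names match nothing
    return {info[4][0] for info in infos}

def same_node(a, b, resolver=socket.getaddrinfo):
    """True if the two names share at least one resolved address."""
    return bool(addresses(a, resolver) & addresses(b, resolver))

# Stand-in for DNS; names and addresses are made up for the example.
FAKE_DNS = {
    "head": ["10.0.0.1"],
    "head.example.org": ["10.0.0.1", "192.0.2.7"],
    "10.0.0.1": ["10.0.0.1"],
    "node1": ["10.0.0.2"],
}

def fake_resolver(name, port):
    """Mimic the tuple layout returned by socket.getaddrinfo."""
    if name not in FAKE_DNS:
        raise socket.gaierror("unknown host: " + name)
    return [(socket.AF_INET, socket.SOCK_STREAM, 0, "", (ip, 0))
            for ip in FAKE_DNS[name]]

short_vs_fqdn = same_node("head", "head.example.org", fake_resolver)
ip_vs_name = same_node("10.0.0.1", "head", fake_resolver)
different = same_node("head", "node1", fake_resolver)
```

With the default resolver the result depends on where the code runs, which is exactly the limitation when the client is not on the head node.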
>>>>>>>>>>>>>>>>>> On Sep 19, 2008, at 7:18 AM, Greg Watson wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Ralph,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The problem we're seeing is just with the head
>>>>>>>>>>>>>>>>>>> node. If I specify a particular IP address for the
>>>>>>>>>>>>>>>>>>> head node in the hostfile, it gets changed to the
>>>>>>>>>>>>>>>>>>> FQDN when displayed in the map. This is a problem
>>>>>>>>>>>>>>>>>>> for us as we need to be able to match the two, and
>>>>>>>>>>>>>>>>>>> since we're not necessarily running on the head
>>>>>>>>>>>>>>>>>>> node, we can't always do the same resolution
>>>>>>>>>>>>>>>>>>> you're doing.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Would it be possible to use the same address that
>>>>>>>>>>>>>>>>>>> is specified in the hostfile, or alternatively
>>>>>>>>>>>>>>>>>>> provide an XML attribute that contains this
>>>>>>>>>>>>>>>>>>> information?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Greg
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sep 11, 2008, at 9:06 AM, Ralph Castain wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Not in that regard, depending upon what you mean
>>>>>>>>>>>>>>>>>>>> by "recently". The only changes I am aware of wrt
>>>>>>>>>>>>>>>>>>>> nodes consisted of some changes to the order in
>>>>>>>>>>>>>>>>>>>> which we use the nodes when specified by hostfile
>>>>>>>>>>>>>>>>>>>> or -host, and a little #if protectionism needed
>>>>>>>>>>>>>>>>>>>> by Brian for the Cray port.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Are you seeing this for every node? Reason I ask:
>>>>>>>>>>>>>>>>>>>> I can't offhand think of anything in the code
>>>>>>>>>>>>>>>>>>>> base that would replace a host name with the FQDN
>>>>>>>>>>>>>>>>>>>> because we don't get that info for remote nodes.
>>>>>>>>>>>>>>>>>>>> The only exception is the head node (where mpirun
>>>>>>>>>>>>>>>>>>>> sits) - in that lone case, we default to the name
>>>>>>>>>>>>>>>>>>>> returned to us by gethostname(). We do that
>>>>>>>>>>>>>>>>>>>> because the head node is frequently accessible on
>>>>>>>>>>>>>>>>>>>> a more global basis than the compute nodes -
>>>>>>>>>>>>>>>>>>>> thus, the FQDN is required to ensure that there
>>>>>>>>>>>>>>>>>>>> is no address confusion on the network.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> If the user refers to compute nodes in a hostfile
>>>>>>>>>>>>>>>>>>>> or -host (or in an allocation from a resource
>>>>>>>>>>>>>>>>>>>> manager) by non-FQDN, we just assume they know
>>>>>>>>>>>>>>>>>>>> what they are doing and the name will correctly
>>>>>>>>>>>>>>>>>>>> resolve to a unique address.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sep 10, 2008, at 9:45 AM, Greg Watson wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Has the behavior of the -display-map option
>>>>>>>>>>>>>>>>>>>>> changed recently in the 1.3 branch? We're now
>>>>>>>>>>>>>>>>>>>>> seeing the host name as a fully resolved DN
>>>>>>>>>>>>>>>>>>>>> rather than the entry that was specified in the
>>>>>>>>>>>>>>>>>>>>> hostfile. Is there any particular reason for
>>>>>>>>>>>>>>>>>>>>> this? If so, would it be possible to add the
>>>>>>>>>>>>>>>>>>>>> hostfile entry to the output, since we need to
>>>>>>>>>>>>>>>>>>>>> be able to match the two?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Greg
>>>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>>>> devel mailing list
>>>>>>>>>>>>>>>>>>>>> devel_at_[hidden]
>>>>>>>>>>>>>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel