Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How are the Open MPI processes spawned?
From: Paul Kapinos (kapinos_at_[hidden])
Date: 2011-11-24 13:49:30


Hello Ralph, Terry, all!

Again, two pieces of news: the good one and the second one.

Ralph Castain wrote:
> Yes, that would indeed break things. The 1.5 series isn't correctly
> checking connections across multiple interfaces until it finds one that
> works - it just uses the first one it sees. :-(

Yahhh!!
This behaviour - grabbing a random interface and hanging forever if
something is wrong with it - is somewhat less than perfect.

From my perspective - the user's one - Open MPI should try to use either
*all* available networks (as 1.4 does...), starting with the
high-performance ones, or *only* those interfaces to which the hostnames
from the hostfile are bound.

Also, there should be timeouts (if you cannot connect to a node within a
minute, you will probably never be connected...).

If a connection runs into a timeout, a warning would be great (along with
a hint to exclude the interface via oob_tcp_if_exclude, btl_tcp_if_exclude).

Should it not?
Maybe you can file this as a "request for enhancement"...
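
For reference, excluding a misbehaving interface by hand would look
something like this (just a sketch - eth0 and lo stand in for whatever
interfaces should be avoided, and ./a.out for the application):

$ mpiexec -mca oob_tcp_if_exclude eth0,lo \
          -mca btl_tcp_if_exclude eth0,lo ./a.out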

> The solution is to specify -mca oob_tcp_if_include ib0. This will direct
> the run-time wireup across the IP over IB interface.
>
> You will also need the -mca btl_tcp_if_include ib0 as well so the MPI
> comm goes exclusively over that network.

YES! This works. Adding
-mca oob_tcp_if_include ib0 -mca btl_tcp_if_include ib0
to the mpiexec command line lets me run the 1.5.x programs, so I
believe this is the workaround.
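
That is, the full call now looks something like this (with hostfile-mini
and MPI_FastTest.exe as in my earlier mails):

$ mpiexec --hostfile hostfile-mini \
    -mca oob_tcp_if_include ib0 -mca btl_tcp_if_include ib0 \
    MPI_FastTest.exe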

Many thanks for this hint, Ralph! My fault for not finding it in the FAQ
(I was so close :o): http://www.open-mpi.org/faq/?category=tcp#tcp-selection

But then I ran into yet another issue. In
http://www.open-mpi.org/faq/?category=tuning#setting-mca-params
the way to set MCA parameters via environment variables is described.

I tried it:
$ export OMPI_MCA_oob_tcp_if_include=ib0
$ export OMPI_MCA_btl_tcp_if_include=ib0

I checked it:
$ ompi_info --param all all | grep oob_tcp_if_include
                  MCA oob: parameter "oob_tcp_if_include" (current
value: <ib0>, data source: environment or cmdline)
$ ompi_info --param all all | grep btl_tcp_if_include
                  MCA btl: parameter "btl_tcp_if_include" (current
value: <ib0>, data source: environment or cmdline)
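
Just to rule out a shell issue, the variables really are set in the
calling environment:

$ env | grep OMPI_MCA
OMPI_MCA_oob_tcp_if_include=ib0
OMPI_MCA_btl_tcp_if_include=ib0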

But then I got the hang-up issue again!

==> It seems mpiexec does not honour these environment variables and
only picks up the command-line options. This should not be so, should it?

(I also tried to explicitly forward the environment variables with -x
OMPI_MCA_oob_tcp_if_include -x OMPI_MCA_btl_tcp_if_include - nothing
changed. Well, they are OMPI_ variables and should be forwarded in any case.)
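
A further way to make the settings persistent - documented in the same
FAQ entry, though I have not yet tested whether it behaves differently
here - is a per-user parameter file:

$ cat $HOME/.openmpi/mca-params.conf
oob_tcp_if_include = ib0
btl_tcp_if_include = ib0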

Best wishes and many thanks for all,

Paul Kapinos

> Specifying both include and
> exclude should generate an error as those are mutually exclusive options
> - I think this was also missed in early 1.5 releases and was recently
> patched.
>
> HTH
> Ralph
>
>
> On Nov 23, 2011, at 12:14 PM, TERRY DONTJE wrote:
>
>> On 11/23/2011 2:02 PM, Paul Kapinos wrote:
>>> Hello Ralph, hello all,
>>>
>>> Two pieces of news, as usual a good one and a bad one.
>>>
>>> The good: we believe we have found out *why* it hangs.
>>>
>>> The bad: it seems to me that this is a bug, or at least an
>>> undocumented feature, of Open MPI 1.5.x.
>>>
>>> In detail:
>>> As said, we see mysterious hang-ups when starting on some nodes using
>>> some permutation of hostnames. Usually removing "some bad" nodes
>>> helps; sometimes a permutation of the node names in the hostfile is
>>> enough(!). The behaviour is reproducible.
>>>
>>> The machines have at least 2 networks:
>>>
>>> *eth0* is used for installation, monitoring, ... - this Ethernet
>>> network is very slim
>>>
>>> *ib0* is the "IP over IB" interface and is used for everything: the
>>> file systems, ssh and so on. The hostnames are bound to the ib0
>>> network; our idea was not to use eth0 for MPI at all.
>>>
>>> All machines are reachable from any other over ib0 (they are all in
>>> one network).
>>>
>>> But on eth0 there are at least two different networks; in particular,
>>> the node linuxbsc025 is in a different network than the others and is
>>> not reachable from the other nodes over eth0! (It is reachable over
>>> ib0; the name used in the hostfile resolves to the IP of ib0.)
>>>
>>> So I believe that Open MPI 1.5.x tries to communicate over eth0,
>>> cannot do it, and hangs. 1.4.3 does not hang, so this issue is
>>> 1.5.x-specific (seen in 1.5.3 and 1.5.4). A bug?
>>>
>>> I also tried to disable eth0 completely:
>>>
>>> $ mpiexec -mca btl_tcp_if_exclude eth0,lo -mca btl_tcp_if_include
>>> ib0 ...
>>>
>> I believe if you give "-mca btl_tcp_if_include ib0" you do not need to
>> specify the exclude parameter.
>>> ...but this does not help. All right, the above command should
>>> disable the use of eth0 for the MPI communication itself, but it
>>> hangs even before MPI is started, doesn't it? (Because one process
>>> is missing, MPI_Init cannot complete.)
>>>
>> By "just before the MPI is started" do you mean while orte is
>> launching the processes.
>> I wonder if you need to specify "-mca oob_tcp_if_include ib0" also but
>> I think that may depend on which oob you are using.
>>> Now a question: is there a way to forbid mpiexec from using certain
>>> interfaces at all?
>>>
>>> Best wishes,
>>>
>>> Paul Kapinos
>>>
>>> P.S. Of course we know it would be a good idea to bring all nodes
>>> into the same network on eth0, but at this point that is impossible
>>> for technical reasons...
>>>
>>> P.S.2: I'm not sure that the issue is really rooted in the
>>> above-mentioned misconfiguration of eth0, but I have no better idea
>>> at this point...
>>>
>>>
>>>>> The map seems to be correctly built; also the output of the daemons
>>>>> seems to be the same (see helloworld.txt)
>>>>
>>>> Unfortunately, it appears that OMPI was not built with
>>>> --enable-debug as there is no debug info in the output. Without a
>>>> debug installation of OMPI, the ability to determine the problem is
>>>> pretty limited.
>>>
>>> Well, this will be the next option we activate. We also have
>>> another issue here, about (not) using uDAPL...
>>>
>>>
>>>>
>>>>
>>>>>> You should also try putting that long list of nodes in a hostfile
>>>>>> - see if that makes a difference.
>>>>>> It will process the nodes thru a different code path, so if there
>>>>>> is some problem in --host,
>>>>>> this will tell us.
>>>>> No, with the hostfile instead of the host list on the command line
>>>>> the behaviour is the same.
>>>>>
>>>>> But I just found out that 1.4.3 does *not* hang in this
>>>>> constellation. The next thing I will try is the installation
>>>>> of 1.5.4 :o)
>>>>>
>>>>> Best,
>>>>>
>>>>> Paul
>>>>>
>>>>> P.S. started:
>>>>>
>>>>> $ /opt/MPI/openmpi-1.5.3/linux/intel/bin/mpiexec --hostfile
>>>>> hostfile-mini -mca odls_base_verbose 5 --leave-session-attached
>>>>> --display-map helloworld 2>&1 | tee helloworld.txt
>>>>>
>>>>>
>>>>>
>>>>>> On Nov 21, 2011, at 9:33 AM, Paul Kapinos wrote:
>>>>>>> Hello Open MPI volks,
>>>>>>>
>>>>>>> We use Open MPI 1.5.3 on our fairly new 1800+ node InfiniBand
>>>>>>> cluster, and we see some strange hangups when starting Open MPI
>>>>>>> processes.
>>>>>>>
>>>>>>> The nodes are named linuxbsc001, linuxbsc002, ... (with some gaps
>>>>>>> due to offline nodes). Each node is accessible from every other
>>>>>>> over SSH (without password), and MPI programs between any two
>>>>>>> nodes have been verified to run.
>>>>>>>
>>>>>>>
>>>>>>> So then, I tried to start a larger number of processes, one
>>>>>>> process per node:
>>>>>>> $ mpiexec -np NN --host linuxbsc001,linuxbsc002,...
>>>>>>> MPI_FastTest.exe
>>>>>>>
>>>>>>> Now the problem: there are some constellations of names in the
>>>>>>> host list on which mpiexec reproducibly hangs forever; and, more
>>>>>>> surprisingly, another *permutation* of the *same* node names may
>>>>>>> run without any errors!
>>>>>>>
>>>>>>> Example: the command in laueft.txt runs OK, the command in
>>>>>>> haengt.txt hangs. Note: the only difference is that the node
>>>>>>> linuxbsc025 is moved to the end of the host list. Amazing, isn't it?
>>>>>>>
>>>>>>> Looking at the individual nodes while the above mpiexec hangs,
>>>>>>> we found the orted daemons started on *each* node and the binary
>>>>>>> on all but one node (orted.txt, MPI_FastTest.txt).
>>>>>>> Again amazing: the node with no user process started (leading, I
>>>>>>> believe, to a hang in MPI_Init of all processes and thus to the
>>>>>>> overall hang) was always the same, linuxbsc005, which is NOT the
>>>>>>> permuted item linuxbsc025...
>>>>>>>
>>>>>>> This behaviour is reproducible. The hang only occurs if the
>>>>>>> started application is an MPI application ("hostname" does not hang).
>>>>>>>
>>>>>>>
>>>>>>> Any idea what is going on?
>>>>>>>
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Paul Kapinos
>>>>>>>
>>>>>>>
>>>>>>> P.S.: no alias names are used; all names are real ones
>>>>>>>
>>>>>>> linuxbsc001: STDOUT: 24323 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc002: STDOUT: 2142 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc003: STDOUT: 69266 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc004: STDOUT: 58899 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc006: STDOUT: 68255 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc007: STDOUT: 62026 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc008: STDOUT: 54221 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc009: STDOUT: 55482 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc010: STDOUT: 59380 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc011: STDOUT: 58312 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc014: STDOUT: 56013 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc016: STDOUT: 58563 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc017: STDOUT: 54693 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc018: STDOUT: 54187 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc020: STDOUT: 55811 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc021: STDOUT: 54982 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc022: STDOUT: 50032 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc023: STDOUT: 54044 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc024: STDOUT: 51247 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc025: STDOUT: 18575 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc027: STDOUT: 48969 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc028: STDOUT: 52397 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc029: STDOUT: 52780 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc030: STDOUT: 47537 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc031: STDOUT: 54609 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> linuxbsc032: STDOUT: 52833 ? SLl 0:00 MPI_FastTest.exe
>>>>>>> $ timex /opt/MPI/openmpi-1.5.3/linux/intel/bin/mpiexec -np 27
>>>>>>> --host
>>>>>>> linuxbsc001,linuxbsc002,linuxbsc003,linuxbsc004,linuxbsc005,linuxbsc006,linuxbsc007,linuxbsc008,linuxbsc009,linuxbsc010,linuxbsc011,linuxbsc014,linuxbsc016,linuxbsc017,linuxbsc018,linuxbsc020,linuxbsc021,linuxbsc022,linuxbsc023,linuxbsc024,linuxbsc025,linuxbsc027,linuxbsc028,linuxbsc029,linuxbsc030,linuxbsc031,linuxbsc032
>>>>>>> MPI_FastTest.exe
>>>>>>> $ timex /opt/MPI/openmpi-1.5.3/linux/intel/bin/mpiexec -np 27
>>>>>>> --host
>>>>>>> linuxbsc001,linuxbsc002,linuxbsc003,linuxbsc004,linuxbsc005,linuxbsc006,linuxbsc007,linuxbsc008,linuxbsc009,linuxbsc010,linuxbsc011,linuxbsc014,linuxbsc016,linuxbsc017,linuxbsc018,linuxbsc020,linuxbsc021,linuxbsc022,linuxbsc023,linuxbsc024,linuxbsc027,linuxbsc028,linuxbsc029,linuxbsc030,linuxbsc031,linuxbsc032,linuxbsc025
>>>>>>> MPI_FastTest.exe
>>>>>>> linuxbsc001: STDOUT: 24322 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 1 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc002: STDOUT: 2141 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 2 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc003: STDOUT: 69265 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 3 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc004: STDOUT: 58898 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 4 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc005: STDOUT: 65642 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 5 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc006: STDOUT: 68254 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 6 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc007: STDOUT: 62025 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 7 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc008: STDOUT: 54220 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 8 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc009: STDOUT: 55481 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 9 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc010: STDOUT: 59379 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 10 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc011: STDOUT: 58311 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 11 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc014: STDOUT: 56012 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 12 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc016: STDOUT: 58562 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 13 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc017: STDOUT: 54692 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 14 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc018: STDOUT: 54186 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 15 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc020: STDOUT: 55810 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 16 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc021: STDOUT: 54981 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 17 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc022: STDOUT: 50031 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 18 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc023: STDOUT: 54043 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 19 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc024: STDOUT: 51246 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 20 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc025: STDOUT: 18574 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 21 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc027: STDOUT: 48968 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 22 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc028: STDOUT: 52396 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 23 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc029: STDOUT: 52779 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 24 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc030: STDOUT: 47536 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 25 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc031: STDOUT: 54608 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 26 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>> linuxbsc032: STDOUT: 52832 ? Ss 0:00
>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess
>>>>>>> env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 27 -mca
>>>>>>> orte_ess_num_procs 28 --hnp-uri
>>>>>>> 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>
>>>>> linuxbsc005 slots=1
>>>>> linuxbsc006 slots=1
>>>>> linuxbsc007 slots=1
>>>>> linuxbsc008 slots=1
>>>>> linuxbsc009 slots=1
>>>>> linuxbsc010 slots=1
>>>>> linuxbsc011 slots=1
>>>>> linuxbsc014 slots=1
>>>>> linuxbsc016 slots=1
>>>>> linuxbsc017 slots=1
>>>>> linuxbsc018 slots=1
>>>>> linuxbsc020 slots=1
>>>>> linuxbsc021 slots=1
>>>>> linuxbsc022 slots=1
>>>>> linuxbsc023 slots=1
>>>>> linuxbsc024 slots=1
>>>>> linuxbsc025 slots=1
>>>>>
>>>>> [linuxc2.rz.RWTH-Aachen.DE:22229] mca:base:select:( odls) Querying component [default]
>>>>> [linuxc2.rz.RWTH-Aachen.DE:22229] mca:base:select:( odls) Query of
>>>>> component [default] set priority to 1
>>>>> [linuxc2.rz.RWTH-Aachen.DE:22229] mca:base:select:( odls) Selected
>>>>> component [default]
>>>>>
>>>>> ======================== JOB MAP ========================
>>>>>
>>>>> Data for node: linuxbsc005 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 0
>>>>>
>>>>> Data for node: linuxbsc006 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 1
>>>>>
>>>>> Data for node: linuxbsc007 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 2
>>>>>
>>>>> Data for node: linuxbsc008 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 3
>>>>>
>>>>> Data for node: linuxbsc009 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 4
>>>>>
>>>>> Data for node: linuxbsc010 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 5
>>>>>
>>>>> Data for node: linuxbsc011 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 6
>>>>>
>>>>> Data for node: linuxbsc014 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 7
>>>>>
>>>>> Data for node: linuxbsc016 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 8
>>>>>
>>>>> Data for node: linuxbsc017 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 9
>>>>>
>>>>> Data for node: linuxbsc018 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 10
>>>>>
>>>>> Data for node: linuxbsc020 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 11
>>>>>
>>>>> Data for node: linuxbsc021 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 12
>>>>>
>>>>> Data for node: linuxbsc022 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 13
>>>>>
>>>>> Data for node: linuxbsc023 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 14
>>>>>
>>>>> Data for node: linuxbsc024 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 15
>>>>>
>>>>> Data for node: linuxbsc025 Num procs: 1
>>>>> Process OMPI jobid: [87,1] Process rank: 16
>>>>>
>>>>> =============================================================
>>>>> [linuxbsc007.rz.RWTH-Aachen.DE:07574] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc007.rz.RWTH-Aachen.DE:07574] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc007.rz.RWTH-Aachen.DE:07574] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>> [linuxbsc016.rz.RWTH-Aachen.DE:03146] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc016.rz.RWTH-Aachen.DE:03146] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc016.rz.RWTH-Aachen.DE:03146] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>> [linuxbsc005.rz.RWTH-Aachen.DE:22051] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc005.rz.RWTH-Aachen.DE:22051] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc005.rz.RWTH-Aachen.DE:22051] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>> [linuxbsc011.rz.RWTH-Aachen.DE:07131] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc011.rz.RWTH-Aachen.DE:07131] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc011.rz.RWTH-Aachen.DE:07131] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>> [linuxbsc025.rz.RWTH-Aachen.DE:43153] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc025.rz.RWTH-Aachen.DE:43153] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc025.rz.RWTH-Aachen.DE:43153] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>> [linuxbsc017.rz.RWTH-Aachen.DE:05044] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc017.rz.RWTH-Aachen.DE:05044] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc017.rz.RWTH-Aachen.DE:05044] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>> [linuxbsc018.rz.RWTH-Aachen.DE:01840] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc018.rz.RWTH-Aachen.DE:01840] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc018.rz.RWTH-Aachen.DE:01840] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>> [linuxbsc024.rz.RWTH-Aachen.DE:79549] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc024.rz.RWTH-Aachen.DE:79549] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc024.rz.RWTH-Aachen.DE:79549] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>> [linuxbsc022.rz.RWTH-Aachen.DE:73501] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc022.rz.RWTH-Aachen.DE:73501] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc022.rz.RWTH-Aachen.DE:73501] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>> [linuxbsc023.rz.RWTH-Aachen.DE:03364] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc023.rz.RWTH-Aachen.DE:03364] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc023.rz.RWTH-Aachen.DE:03364] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>> [linuxbsc006.rz.RWTH-Aachen.DE:16811] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc006.rz.RWTH-Aachen.DE:16811] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc006.rz.RWTH-Aachen.DE:16811] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>> [linuxbsc014.rz.RWTH-Aachen.DE:10206] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc014.rz.RWTH-Aachen.DE:10206] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc014.rz.RWTH-Aachen.DE:10206] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>> [linuxbsc008.rz.RWTH-Aachen.DE:00858] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc008.rz.RWTH-Aachen.DE:00858] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc008.rz.RWTH-Aachen.DE:00858] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>> [linuxbsc010.rz.RWTH-Aachen.DE:09727] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc010.rz.RWTH-Aachen.DE:09727] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc010.rz.RWTH-Aachen.DE:09727] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>> [linuxbsc020.rz.RWTH-Aachen.DE:06680] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc020.rz.RWTH-Aachen.DE:06680] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc020.rz.RWTH-Aachen.DE:06680] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>> [linuxbsc009.rz.RWTH-Aachen.DE:05145] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc009.rz.RWTH-Aachen.DE:05145] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc009.rz.RWTH-Aachen.DE:05145] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>> [linuxbsc021.rz.RWTH-Aachen.DE:01405] mca:base:select:( odls)
>>>>> Querying component [default]
>>>>> [linuxbsc021.rz.RWTH-Aachen.DE:01405] mca:base:select:( odls) Query
>>>>> of component [default] set priority to 1
>>>>> [linuxbsc021.rz.RWTH-Aachen.DE:01405] mca:base:select:( odls)
>>>>> Selected component [default]
>>>>
>>>>
>>>
>>
>> --
>> Terry D. Dontje | Principal Software Engineer
>> Developer Tools Engineering | +1.781.442.2631
>> Oracle - Performance Technologies
>> 95 Network Drive, Burlington, MA 01803
>> Email: terry.dontje_at_[hidden]
>>
>

-- 
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, Center for Computing and Communication
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915