Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How are the Open MPI processes spawned?
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2011-11-28 17:39:21


(off list)

Are you sure about OMPI_MCA_* params not being treated specially? I know for a fact that they *used* to be. I.e., we bundled up all env variables that began with OMPI_MCA_* and sent them with the job to back-end nodes. It allowed sysadmins to set global MCA param values without editing the MCA param file on every node.

It looks like this is still happening on the trunk:

[14:38] svbu-mpi:~ % cat run
#!/bin/csh -f

echo on `hostname`, foo is: $OMPI_MCA_foo
exit 0
[14:38] svbu-mpi:~ % setenv OMPI_MCA_foo bar
[14:38] svbu-mpi:~ % ./run
on svbu-mpi.cisco.com, foo is: bar
[14:38] svbu-mpi:~ % mpirun -np 2 --bynode run
on svbu-mpi044, foo is: bar
on svbu-mpi043, foo is: bar
[14:38] svbu-mpi:~ % unsetenv OMPI_MCA_foo
[14:38] svbu-mpi:~ % mpirun -np 2 --bynode run
OMPI_MCA_foo: Undefined variable.
OMPI_MCA_foo: Undefined variable.
-------------------------------------------------------
While the primary job terminated normally, 2 processes returned
non-zero exit codes.. Further examination may be required.
-------------------------------------------------------
[14:38] svbu-mpi:~ %
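
For anyone repeating this check from a Bourne-style shell, here is a minimal sketch of the same test script. It assumes /bin/sh exists on the back-end nodes; the function name report_mca_var and the UNSET marker are made up for illustration. Unlike the csh version, it reports an unset variable instead of failing with a non-zero exit code.

```shell
#!/bin/sh
# Hypothetical helper: print the value of OMPI_MCA_<param> as seen by
# this process, or the marker UNSET if the variable is not defined.
# <param> must be a plain identifier (letters, digits, underscores).
report_mca_var() {
    name="OMPI_MCA_$1"
    # Indirect lookup: expands to value=${OMPI_MCA_<param>-UNSET}
    eval "value=\${$name-UNSET}"
    # uname -n is the POSIX way to print the node name (like hostname)
    printf 'on %s, %s is: %s\n' "$(uname -n)" "$1" "$value"
}

report_mca_var foo
```

Launched through mpirun in place of the csh script, each process should print one line, whether or not the variable reached it.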

(I did not read this thread too carefully, so perhaps I missed an inference in here somewhere...)
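
Since Paul reports below that the command-line form works where the environment form does not, one possible stop-gap is a wrapper that turns OMPI_MCA_* variables into explicit -mca options. The following is only a sketch under assumptions: the function name mca_args_from_env is hypothetical, and it only handles values without whitespace or newlines (interface names such as ib0 are fine).

```shell
#!/bin/sh
# Hypothetical helper: emit one "-mca <name> <value>" line for each
# OMPI_MCA_* variable found in the current environment.
mca_args_from_env() {
    # IFS='=' splits at the first '='; the rest of the line stays in value
    env | while IFS='=' read -r name value; do
        case "$name" in
            OMPI_MCA_*)
                printf '%s %s %s\n' '-mca' "${name#OMPI_MCA_}" "$value"
                ;;
        esac
    done
}

# Example use (commented out; the program name is a placeholder).
# The $(...) is left unquoted on purpose so the output splits into
# separate arguments:
#   mpirun $(mca_args_from_env) -np 2 ./my_mpi_program
```

This keeps the per-machine flexibility of environment variables while passing the values on the command line, which is the path reported to work in this thread.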

On Nov 25, 2011, at 5:21 PM, Ralph Castain wrote:

>
> On Nov 25, 2011, at 12:29 PM, Paul Kapinos wrote:
>
>> Hello again,
>>
>>>> Ralph Castain wrote:
>>>>> Yes, that would indeed break things. The 1.5 series isn't correctly checking connections across multiple interfaces until it finds one that works - it just uses the first one it sees. :-(
>>>> Yahhh!!
>>>> This behaviour - catch a random interface and hang forever if something is wrong with it - is somewhat less than perfect.
>>>>
>>>> From my perspective - the user's one - Open MPI should try to use either *all* available networks (as 1.4 does...), starting with the high performance ones, or *only* those interfaces to which the hostnames from the hostfile are bound.
>>> It is indeed supposed to do the former - as I implied, this is a bug in the 1.5 series.
>>
>> Thanks for the clarification. I was not sure whether this is a bug or a feature :-)
>>
>>
>>
>>>> Also, there should be timeouts (if you cannot connect to a node within a minute you probably will never ever be connected...)
>>> We have debated about this for some time - there is a timeout mca param one can set, but we'll consider again making it default.
>>>> If some connection runs into a timeout, a warning would be great (and a hint to exclude the interface via oob_tcp_if_exclude, btl_tcp_if_exclude).
>>>>
>>>> Should it not?
>>>> Maybe you can file it as a "call for enhancement"...
>>> Probably the right approach at this time.
>>
>> Ahhh.. sorry, I did not understand what you mean.
>> Did you file it, or did someone else, or should I do it somehow? Or should I not?
>
> I'll take care of it, and copy you on the ticket so you can see what happens.
>
> I'll also do the same for the connection bug - sorry for the problem :-(
>
>
>>
>>
>>
>>
>>
>>
>>>> But then I ran into yet another issue. In http://www.open-mpi.org/faq/?category=tuning#setting-mca-params
>>>> the way to define MCA parameters via environment variables is described.
>>>>
>>>> I tried it:
>>>> $ export OMPI_MCA_oob_tcp_if_include=ib0
>>>> $ export OMPI_MCA_btl_tcp_if_include=ib0
>>>>
>>>>
>>>> I checked it:
>>>> $ ompi_info --param all all | grep oob_tcp_if_include
>>>> MCA oob: parameter "oob_tcp_if_include" (current value: <ib0>, data source: environment or cmdline)
>>>> $ ompi_info --param all all | grep btl_tcp_if_include
>>>> MCA btl: parameter "btl_tcp_if_include" (current value: <ib0>, data source: environment or cmdline)
>>>>
>>>>
>>>> But then I get the hang-up issue again!
>>>>
>>>> ==> It seems mpiexec does not understand these environment variables and only picks up the command-line options. Surely this should not be so?
>>> No, that isn't what is happening. The problem lies in the behavior of rsh/ssh. This environment does not forward environmental variables. Because of limits on cmd line length, we don't automatically forward MCA params from the environment, but only from the cmd line. It is an annoying limitation, but one outside our control.
>>
>> We know about "ssh does not forward environmental variables." But in this case, are these parameters not the parameters of mpiexec itself, too?
>>
>> The crucial thing is that setting the parameters works via the command line but *does not work* via the envvar way (as described in http://www.open-mpi.org/faq/?category=tuning#setting-mca-params). This looks like a bug to me!
>>
>>
>>
>>
>>
>>> Put those envars in the default mca param file and the problem will be resolved.
>>
>> You mean e.g. $prefix/etc/openmpi-mca-params.conf as described in item 4 of http://www.open-mpi.org/faq/?category=tuning#setting-mca-params
>>
>> Well, this is possible, but not flexible enough for us (because there are some machines which can only run if the parameters are *not* set - on those, ssh goes only over the eth0 devices).
>>
>> For now we use the command-line parameters and hope the envvar way will work someday.
>>
>>
>>>> (I also tried to pass the envvars explicitly with -x OMPI_MCA_oob_tcp_if_include -x OMPI_MCA_btl_tcp_if_include - nothing changed.
>>> I'm surprised by that - they should be picked up and forwarded. Could be a bug
>>
>> Well, I also think this is a bug, but as I said, not in forwarding the values of the envvars but in detecting these parameters at all. Or maybe in both.
>>
>>
>>
>>
>>>> Well, they are OMPI_ variables and should be provided in any case).
>>> No, they aren't - they are not treated differently than any other envar.
>>
>> [after performing some RTFM...]
>> at least the man page of mpiexec says that the OMPI_ environment variables are always provided and thus treated *differently* from other envvars:
>>
>> $ man mpiexec
>> ....
>> Exported Environment Variables
>> All environment variables that are named in the form OMPI_* will automatically be exported to new processes on the local and remote nodes.
>>
>> So, is the man page wrong, or is this a removed feature, or something else?
>>
>>
>> Best wishes,
>>
>> Paul Kapinos
>>
>>
>>
>>
>>
>>>>> Specifying both include and exclude should generate an error as those are mutually exclusive options - I think this was also missed in early 1.5 releases and was recently patched.
>>>>> HTH
>>>>> Ralph
>>>>> On Nov 23, 2011, at 12:14 PM, TERRY DONTJE wrote:
>>>>>> On 11/23/2011 2:02 PM, Paul Kapinos wrote:
>>>>>>> Hello Ralph, hello all,
>>>>>>>
>>>>>>> Two pieces of news: as usual, a good one and a bad one.
>>>>>>>
>>>>>>> The good: we believe we have found out *why* it hangs.
>>>>>>>
>>>>>>> The bad: it seems to me that this is a bug or at least an undocumented feature of Open MPI /1.5.x.
>>>>>>>
>>>>>>> In detail:
>>>>>>> As said, we see mysterious hang-ups when starting on some nodes using some permutation of hostnames. Usually removing "some bad" nodes helps; sometimes a permutation of node names in the hostfile is enough(!). The behaviour is reproducible.
>>>>>>>
>>>>>>> The machines have at least 2 networks:
>>>>>>>
>>>>>>> *eth0* is used for installation, monitoring, ... - this ethernet is very slim
>>>>>>>
>>>>>>> *ib0* - is the "IP over IB" interface and is used for everything: the file systems, ssh and so on. The hostnames are bound to the ib0 network; our idea was not to use eth0 for MPI at all.
>>>>>>>
>>>>>>> All machines are reachable from any other over ib0 (they are in one network).
>>>>>>>
>>>>>>> But on eth0 there are at least two different networks; in particular, the computer linuxbsc025 is in a different network than the others and is not reachable from other nodes over eth0! (But it is reachable over ib0. The name used in the hostfile is resolved to the IP of ib0.)
>>>>>>>
>>>>>>> So I believe that Open MPI /1.5.x tries to communicate over eth0 and cannot do it, and hangs. The /1.4.3 does not hang, so this issue is 1.5.x-specific (seen in 1.5.3 and 1.5.4). A bug?
>>>>>>>
>>>>>>> I also tried to disable the eth0 completely:
>>>>>>>
>>>>>>> $ mpiexec -mca btl_tcp_if_exclude eth0,lo -mca btl_tcp_if_include ib0 ...
>>>>>>>
>>>>>> I believe if you give "-mca btl_tcp_if_include ib0" you do not need to specify the exclude parameter.
>>>>>>> ...but this does not help. All right, the above command should disable the usage of eth0 for MPI communication itself, but it hangs just before MPI is started, doesn't it? (Because one process is missing, MPI_Init cannot complete.)
>>>>>>>
>>>>>> By "just before the MPI is started" do you mean while orte is launching the processes?
>>>>>> I wonder if you need to specify "-mca oob_tcp_if_include ib0" also but I think that may depend on which oob you are using.
>>>>>>> Now a question: is there a way to forbid mpiexec from using certain interfaces at all?
>>>>>>>
>>>>>>> Best wishes,
>>>>>>>
>>>>>>> Paul Kapinos
>>>>>>>
>>>>>>> P.S. Of course we know about the good idea to bring all nodes into the same net on eth0, but at this point it is impossible due to technical reason[s]...
>>>>>>>
>>>>>>> P.S.2 I'm not sure that the issue is really rooted in the above mentioned misconfiguration of eth0, but I have no better idea at this point...
>>>>>>>
>>>>>>>
>>>>>>>>> The map seems to be correctly built, and the output of the daemons also seems to be the same (see helloworld.txt)
>>>>>>>> Unfortunately, it appears that OMPI was not built with --enable-debug as there is no debug info in the output. Without a debug installation of OMPI, the ability to determine the problem is pretty limited.
>>>>>>> well, this will be the next option we will activate. We also have another issue here, on (not) using uDAPL..
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>> You should also try putting that long list of nodes in a hostfile - see if that makes a difference.
>>>>>>>>>> It will process the nodes thru a different code path, so if there is some problem in --host,
>>>>>>>>>> this will tell us.
>>>>>>>>> No, with the host file instead of host list on command line the behaviour is the same.
>>>>>>>>>
>>>>>>>>> But, I just found out that the 1.4.3 does *not* hang on this constellation. The next thing I will try will be the installation of 1.5.4 :o)
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> Paul
>>>>>>>>>
>>>>>>>>> P.S. started:
>>>>>>>>>
>>>>>>>>> $ /opt/MPI/openmpi-1.5.3/linux/intel/bin/mpiexec --hostfile hostfile-mini -mca odls_base_verbose 5 --leave-session-attached --display-map helloworld 2>&1 | tee helloworld.txt
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On Nov 21, 2011, at 9:33 AM, Paul Kapinos wrote:
>>>>>>>>>>> Hello Open MPI folks,
>>>>>>>>>>>
>>>>>>>>>>> We use Open MPI 1.5.3 on our pretty new 1800+ node InfiniBand cluster, and we see some strange hang-ups when starting Open MPI processes.
>>>>>>>>>>>
>>>>>>>>>>> The nodes are named linuxbsc001,linuxbsc002,... (with some gaps due to offline nodes). Each node is accessible from every other over SSH (without password), and MPI programs have been checked to run between any two nodes.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> So far, I have tried to start a larger number of processes, one process per node:
>>>>>>>>>>> $ mpiexec -np NN --host linuxbsc001,linuxbsc002,... MPI_FastTest.exe
>>>>>>>>>>>
>>>>>>>>>>> Now the problem: there are some constellations of names in the host list on which mpiexec reproducibly hangs forever; and more surprising: another *permutation* of the *same* node names may run without any errors!
>>>>>>>>>>>
>>>>>>>>>>> Example: the command in laueft.txt runs OK, the command in haengt.txt hangs. Note: the only difference is that the node linuxbsc025 is put at the end of the host list. Amazed, too?
>>>>>>>>>>>
>>>>>>>>>>> Looking at the particular nodes while the above mpiexec hangs, we found the orted daemons started on *each* node and the binary on all but one node (orted.txt, MPI_FastTest.txt).
>>>>>>>>>>> Again, it is amazing that the node with no user process started (leading to a hang in MPI_Init of all processes and thus to the overall hang, I believe) was always the same, linuxbsc005, which is NOT the permuted item linuxbsc025...
>>>>>>>>>>>
>>>>>>>>>>> This behaviour is reproducible. The hang only occurs if the started application is an MPI application ("hostname" does not hang).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Any idea what is going on?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>>
>>>>>>>>>>> Paul Kapinos
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> P.S: no alias names used, all names are real ones
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Dipl.-Inform. Paul Kapinos - High Performance Computing,
>>>>>>>>>>> RWTH Aachen University, Center for Computing and Communication
>>>>>>>>>>> Seffenter Weg 23, D 52074 Aachen (Germany)
>>>>>>>>>>> Tel: +49 241/80-24915
>>>>>>>>>>> linuxbsc001: STDOUT: 24323 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc002: STDOUT: 2142 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc003: STDOUT: 69266 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc004: STDOUT: 58899 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc006: STDOUT: 68255 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc007: STDOUT: 62026 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc008: STDOUT: 54221 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc009: STDOUT: 55482 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc010: STDOUT: 59380 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc011: STDOUT: 58312 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc014: STDOUT: 56013 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc016: STDOUT: 58563 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc017: STDOUT: 54693 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc018: STDOUT: 54187 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc020: STDOUT: 55811 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc021: STDOUT: 54982 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc022: STDOUT: 50032 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc023: STDOUT: 54044 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc024: STDOUT: 51247 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc025: STDOUT: 18575 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc027: STDOUT: 48969 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc028: STDOUT: 52397 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc029: STDOUT: 52780 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc030: STDOUT: 47537 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc031: STDOUT: 54609 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc032: STDOUT: 52833 ? SLl 0:00 MPI_FastTest.exe
>>>>>>>>>>> $ timex /opt/MPI/openmpi-1.5.3/linux/intel/bin/mpiexec -np 27 --host linuxbsc001,linuxbsc002,linuxbsc003,linuxbsc004,linuxbsc005,linuxbsc006,linuxbsc007,linuxbsc008,linuxbsc009,linuxbsc010,linuxbsc011,linuxbsc014,linuxbsc016,linuxbsc017,linuxbsc018,linuxbsc020,linuxbsc021,linuxbsc022,linuxbsc023,linuxbsc024,linuxbsc025,linuxbsc027,linuxbsc028,linuxbsc029,linuxbsc030,linuxbsc031,linuxbsc032 MPI_FastTest.exe
>>>>>>>>>>> $ timex /opt/MPI/openmpi-1.5.3/linux/intel/bin/mpiexec -np 27 --host linuxbsc001,linuxbsc002,linuxbsc003,linuxbsc004,linuxbsc005,linuxbsc006,linuxbsc007,linuxbsc008,linuxbsc009,linuxbsc010,linuxbsc011,linuxbsc014,linuxbsc016,linuxbsc017,linuxbsc018,linuxbsc020,linuxbsc021,linuxbsc022,linuxbsc023,linuxbsc024,linuxbsc027,linuxbsc028,linuxbsc029,linuxbsc030,linuxbsc031,linuxbsc032,linuxbsc025 MPI_FastTest.exe
>>>>>>>>>>> linuxbsc001: STDOUT: 24322 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc002: STDOUT: 2141 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 2 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc003: STDOUT: 69265 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 3 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc004: STDOUT: 58898 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 4 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc005: STDOUT: 65642 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 5 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc006: STDOUT: 68254 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 6 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc007: STDOUT: 62025 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 7 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc008: STDOUT: 54220 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 8 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc009: STDOUT: 55481 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 9 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc010: STDOUT: 59379 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 10 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc011: STDOUT: 58311 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 11 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc014: STDOUT: 56012 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 12 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc016: STDOUT: 58562 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 13 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc017: STDOUT: 54692 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 14 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc018: STDOUT: 54186 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 15 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc020: STDOUT: 55810 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 16 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc021: STDOUT: 54981 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 17 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc022: STDOUT: 50031 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 18 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc023: STDOUT: 54043 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 19 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc024: STDOUT: 51246 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 20 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc025: STDOUT: 18574 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 21 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc027: STDOUT: 48968 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 22 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc028: STDOUT: 52396 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 23 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc029: STDOUT: 52779 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 24 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc030: STDOUT: 47536 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 25 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc031: STDOUT: 54608 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 26 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> linuxbsc032: STDOUT: 52832 ? Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 27 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> users mailing list
>>>>>>>>>>> users_at_[hidden]
>>>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>>>> linuxbsc005 slots=1
>>>>>>>>> linuxbsc006 slots=1
>>>>>>>>> linuxbsc007 slots=1
>>>>>>>>> linuxbsc008 slots=1
>>>>>>>>> linuxbsc009 slots=1
>>>>>>>>> linuxbsc010 slots=1
>>>>>>>>> linuxbsc011 slots=1
>>>>>>>>> linuxbsc014 slots=1
>>>>>>>>> linuxbsc016 slots=1
>>>>>>>>> linuxbsc017 slots=1
>>>>>>>>> linuxbsc018 slots=1
>>>>>>>>> linuxbsc020 slots=1
>>>>>>>>> linuxbsc021 slots=1
>>>>>>>>> linuxbsc022 slots=1
>>>>>>>>> linuxbsc023 slots=1
>>>>>>>>> linuxbsc024 slots=1
>>>>>>>>> linuxbsc025 slots=1
>>>>>>>>> [linuxc2.rz.RWTH-Aachen.DE:22229] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxc2.rz.RWTH-Aachen.DE:22229] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxc2.rz.RWTH-Aachen.DE:22229] mca:base:select:( odls) Selected component [default]
>>>>>>>>>
>>>>>>>>> ======================== JOB MAP ========================
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc005 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 0
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc006 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 1
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc007 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 2
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc008 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 3
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc009 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 4
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc010 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 5
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc011 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 6
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc014 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 7
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc016 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 8
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc017 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 9
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc018 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 10
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc020 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 11
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc021 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 12
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc022 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 13
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc023 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 14
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc024 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 15
>>>>>>>>>
>>>>>>>>> Data for node: linuxbsc025 Num procs: 1
>>>>>>>>> Process OMPI jobid: [87,1] Process rank: 16
>>>>>>>>>
>>>>>>>>> =============================================================
>>>>>>>>> [linuxbsc007.rz.RWTH-Aachen.DE:07574] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc007.rz.RWTH-Aachen.DE:07574] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc007.rz.RWTH-Aachen.DE:07574] mca:base:select:( odls) Selected component [default]
>>>>>>>>> [linuxbsc016.rz.RWTH-Aachen.DE:03146] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc016.rz.RWTH-Aachen.DE:03146] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc016.rz.RWTH-Aachen.DE:03146] mca:base:select:( odls) Selected component [default]
>>>>>>>>> [linuxbsc005.rz.RWTH-Aachen.DE:22051] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc005.rz.RWTH-Aachen.DE:22051] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc005.rz.RWTH-Aachen.DE:22051] mca:base:select:( odls) Selected component [default]
>>>>>>>>> [linuxbsc011.rz.RWTH-Aachen.DE:07131] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc011.rz.RWTH-Aachen.DE:07131] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc011.rz.RWTH-Aachen.DE:07131] mca:base:select:( odls) Selected component [default]
>>>>>>>>> [linuxbsc025.rz.RWTH-Aachen.DE:43153] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc025.rz.RWTH-Aachen.DE:43153] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc025.rz.RWTH-Aachen.DE:43153] mca:base:select:( odls) Selected component [default]
>>>>>>>>> [linuxbsc017.rz.RWTH-Aachen.DE:05044] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc017.rz.RWTH-Aachen.DE:05044] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc017.rz.RWTH-Aachen.DE:05044] mca:base:select:( odls) Selected component [default]
>>>>>>>>> [linuxbsc018.rz.RWTH-Aachen.DE:01840] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc018.rz.RWTH-Aachen.DE:01840] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc018.rz.RWTH-Aachen.DE:01840] mca:base:select:( odls) Selected component [default]
>>>>>>>>> [linuxbsc024.rz.RWTH-Aachen.DE:79549] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc024.rz.RWTH-Aachen.DE:79549] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc024.rz.RWTH-Aachen.DE:79549] mca:base:select:( odls) Selected component [default]
>>>>>>>>> [linuxbsc022.rz.RWTH-Aachen.DE:73501] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc022.rz.RWTH-Aachen.DE:73501] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc022.rz.RWTH-Aachen.DE:73501] mca:base:select:( odls) Selected component [default]
>>>>>>>>> [linuxbsc023.rz.RWTH-Aachen.DE:03364] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc023.rz.RWTH-Aachen.DE:03364] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc023.rz.RWTH-Aachen.DE:03364] mca:base:select:( odls) Selected component [default]
>>>>>>>>> [linuxbsc006.rz.RWTH-Aachen.DE:16811] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc006.rz.RWTH-Aachen.DE:16811] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc006.rz.RWTH-Aachen.DE:16811] mca:base:select:( odls) Selected component [default]
>>>>>>>>> [linuxbsc014.rz.RWTH-Aachen.DE:10206] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc014.rz.RWTH-Aachen.DE:10206] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc014.rz.RWTH-Aachen.DE:10206] mca:base:select:( odls) Selected component [default]
>>>>>>>>> [linuxbsc008.rz.RWTH-Aachen.DE:00858] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc008.rz.RWTH-Aachen.DE:00858] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc008.rz.RWTH-Aachen.DE:00858] mca:base:select:( odls) Selected component [default]
>>>>>>>>> [linuxbsc010.rz.RWTH-Aachen.DE:09727] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc010.rz.RWTH-Aachen.DE:09727] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc010.rz.RWTH-Aachen.DE:09727] mca:base:select:( odls) Selected component [default]
>>>>>>>>> [linuxbsc020.rz.RWTH-Aachen.DE:06680] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc020.rz.RWTH-Aachen.DE:06680] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc020.rz.RWTH-Aachen.DE:06680] mca:base:select:( odls) Selected component [default]
>>>>>>>>> [linuxbsc009.rz.RWTH-Aachen.DE:05145] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc009.rz.RWTH-Aachen.DE:05145] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc009.rz.RWTH-Aachen.DE:05145] mca:base:select:( odls) Selected component [default]
>>>>>>>>> [linuxbsc021.rz.RWTH-Aachen.DE:01405] mca:base:select:( odls) Querying component [default]
>>>>>>>>> [linuxbsc021.rz.RWTH-Aachen.DE:01405] mca:base:select:( odls) Query of component [default] set priority to 1
>>>>>>>>> [linuxbsc021.rz.RWTH-Aachen.DE:01405] mca:base:select:( odls) Selected component [default]
>>>>>> --
>>>>>> Terry D. Dontje | Principal Software Engineer
>>>>>> Developer Tools Engineering | +1.781.442.2631
>>>>>> Oracle - Performance Technologies
>>>>>> 95 Network Drive, Burlington, MA 01803
>>>>>> Email terry.dontje_at_[hidden]
>>>>>>
>>>>>>
>>>>>>

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/