Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] ERROR: At least one pair of MPI processes are unable to reach each other for MPI communications.
From: RoboBeans (robobeans_at_[hidden])
Date: 2013-08-03 22:14:39


On issuing the ibhosts command, I can see this:

*# ibhosts | sort*

Ca : 0x002288000070a432 ports 2 "sv-2 qib0"
Ca : 0x002288000070a47c ports 2 "sv-3 qib0"
Ca : 0x002288000070a4a8 ports 2 "sv-1 qib0"
Ca : 0x002288000077ca2c ports 1 "@ HCA-1"
Ca : 0x002288000077d7f4 ports 1 "SERVER-14 HCA-1"
Ca : 0x002288000077f530 ports 1 "@ HCA-1"
Ca : 0x002288000077f92c ports 1 "@ HCA-1"
Ca : 0x0022880000784f54 ports 2 "sv-4 qib0"
Ca : 0x002288000078a946 ports 1 "@ HCA-1"
Ca : 0x002288000078af7e ports 1 "@ HCA-1"
Ca : 0x002288000079806a ports 1 "@ HCA-1"
Ca : 0x002288000079a2b4 ports 1 "@ HCA-1"

Thanks!

On 8/3/13 7:09 PM, RoboBeans wrote:
> On the first 7 nodes:
>
> *[mpidemo_at_SERVER-3 ~]$ ofed_info | head -n 1*
> OFED-1.5.3.2:
>
> *[mpidemo_at_SERVER-3 ~]$ which ofed_info*
> /usr/bin/ofed_info
>
> On the last 4 nodes:
>
> *[mpidemo_at_sv-2 ~]$ ofed_info | head -n 1*
> -bash: ofed_info: command not found
>
> *[mpidemo_at_sv-2 ~]$ which ofed_info*
> /usr/bin/which: no ofed_info in
> (/usr/OPENMPI/openmpi-1.7.2/bin:/usr/OPENMPI/openmpi-1.7.2/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/bin/:/usr/lib/:/usr/lib:/usr:/usr/:/bin/:/usr/lib/:/usr/lib:/usr:/usr/)
>
>
> Are there specific locations where I should look for ofed_info?
> How can I verify whether OFED is installed on a node?
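>
> For example, would generic checks like these be enough (just a rough
> sketch -- this assumes an RPM-based OFED install, which may not match
> our layout)?
>
> *$ rpm -qa | grep -i ofed*
> *$ ls /usr/bin/ofed_info*
> *$ lsmod | grep ib_qib*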
>
> Thanks again!!!
>
>
> On 8/3/13 5:52 PM, Ralph Castain wrote:
>> Are the ofed versions the same across all the machines? I would
>> suspect that might be the problem.
>>
>>
>> On Aug 3, 2013, at 4:06 PM, RoboBeans <robobeans_at_[hidden]> wrote:
>>
>>> Hi Ralph, I tried 1.5.4, 1.6.5, and 1.7.2 (compiled from source)
>>> with no configure arguments, but I am facing the same issue. When I
>>> run a job using 1.5.4 (installed via yum), I get warnings, but they
>>> don't affect my output.
>>>
>>> An example of the warning I get:
>>>
>>> sv-2.7960ipath_userinit: Mismatched user minor version (12) and
>>> driver minor version (11) while context sharing. Ensure that driver
>>> and library are from the same release.
>>>
>>> Each system has a QLogic card ("QLE7342-CK dual-port IB card") and
>>> runs the same OS, but the kernel revisions differ (e.g.
>>> 2.6.32-358.2.1.el6.x86_64 vs. 2.6.32-358.el6.x86_64).
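>>>
>>> Since the ipath_userinit warning says the driver and library come
>>> from different releases, I could also compare the two on each node --
>>> roughly like this (the package names are a guess and may differ on
>>> our systems):
>>>
>>> *[mpidemo_at_sv-2 ~]$ rpm -qa | grep -i -E 'infinipath|psm'*
>>> *[mpidemo_at_sv-2 ~]$ modinfo ib_qib | grep -i version*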
>>>
>>> Thank you for your time.
>>>
>>> On 8/3/13 2:05 PM, Ralph Castain wrote:
>>>> Hmmm...strange indeed. I would remove those four configure options
>>>> and give it a try. That will eliminate all the obvious things, I
>>>> would think, though they aren't generally involved in the issue
>>>> shown here. Still, worth taking out potential trouble sources.
>>>>
>>>> What is the connectivity between SERVER-2 and node 100? Should I
>>>> assume that the first seven nodes are connected via one type of
>>>> interconnect, and the other four are connected to those seven by
>>>> another type?
>>>>
>>>>
>>>> On Aug 3, 2013, at 1:30 PM, RoboBeans <robobeans_at_[hidden]> wrote:
>>>>
>>>>> Thanks for looking into it, Ralph. I modified the hosts file, but I
>>>>> am still getting the same error. Any other pointers you can think
>>>>> of? The difference between this 1.7.2 installation and 1.5.4 is
>>>>> that I installed 1.5.4 using yum, whereas I built 1.7.2 from source
>>>>> and configured it with *--enable-event-thread-support
>>>>> --enable-opal-multi-threads --enable-orte-progress-threads
>>>>> --enable-mpi-thread-multiple*. Am I missing something here?
>>>>>
>>>>> //******************************************************************
>>>>>
>>>>> *$ cat mpi_hostfile*
>>>>>
>>>>> x.x.x.22 slots=15 max-slots=15
>>>>> x.x.x.24 slots=2 max-slots=2
>>>>> x.x.x.26 slots=14 max-slots=14
>>>>> x.x.x.28 slots=16 max-slots=16
>>>>> x.x.x.29 slots=14 max-slots=14
>>>>> x.x.x.30 slots=16 max-slots=16
>>>>> x.x.x.41 slots=46 max-slots=46
>>>>> x.x.x.101 slots=46 max-slots=46
>>>>> x.x.x.100 slots=46 max-slots=46
>>>>> x.x.x.102 slots=22 max-slots=22
>>>>> x.x.x.103 slots=22 max-slots=22
>>>>>
>>>>> //******************************************************************
>>>>> *$ mpirun -d --display-map -np 10 --hostfile mpi_hostfile --bynode ./test*
>>>>> [SERVER-2:08907] procdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-2_0/62216/0/0
>>>>> [SERVER-2:08907] jobdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-2_0/62216/0
>>>>> [SERVER-2:08907] top: openmpi-sessions-mpidemo_at_SERVER-2_0
>>>>> [SERVER-2:08907] tmp: /tmp
>>>>> CentOS release 6.4 (Final)
>>>>> Kernel \r on an \m
>>>>> CentOS release 6.4 (Final)
>>>>> Kernel \r on an \m
>>>>> CentOS release 6.4 (Final)
>>>>> Kernel \r on an \m
>>>>> [SERVER-3:32517] procdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-3_0/62216/0/1
>>>>> [SERVER-3:32517] jobdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-3_0/62216/0
>>>>> [SERVER-3:32517] top: openmpi-sessions-mpidemo_at_SERVER-3_0
>>>>> [SERVER-3:32517] tmp: /tmp
>>>>> CentOS release 6.4 (Final)
>>>>> Kernel \r on an \m
>>>>> CentOS release 6.4 (Final)
>>>>> Kernel \r on an \m
>>>>> [SERVER-6:11595] procdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-6_0/62216/0/4
>>>>> [SERVER-6:11595] jobdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-6_0/62216/0
>>>>> [SERVER-6:11595] top: openmpi-sessions-mpidemo_at_SERVER-6_0
>>>>> [SERVER-6:11595] tmp: /tmp
>>>>> [SERVER-4:27445] procdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-4_0/62216/0/2
>>>>> [SERVER-4:27445] jobdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-4_0/62216/0
>>>>> [SERVER-4:27445] top: openmpi-sessions-mpidemo_at_SERVER-4_0
>>>>> [SERVER-4:27445] tmp: /tmp
>>>>> [SERVER-7:02607] procdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-7_0/62216/0/5
>>>>> [SERVER-7:02607] jobdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-7_0/62216/0
>>>>> [SERVER-7:02607] top: openmpi-sessions-mpidemo_at_SERVER-7_0
>>>>> [SERVER-7:02607] tmp: /tmp
>>>>> [sv-1:46100] procdir: /tmp/openmpi-sessions-mpidemo_at_sv-1_0/62216/0/8
>>>>> [sv-1:46100] jobdir: /tmp/openmpi-sessions-mpidemo_at_sv-1_0/62216/0
>>>>> [sv-1:46100] top: openmpi-sessions-mpidemo_at_sv-1_0
>>>>> [sv-1:46100] tmp: /tmp
>>>>> CentOS release 6.4 (Final)
>>>>> Kernel \r on an \m
>>>>> [SERVER-5:16404] procdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-5_0/62216/0/3
>>>>> [SERVER-5:16404] jobdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-5_0/62216/0
>>>>> [SERVER-5:16404] top: openmpi-sessions-mpidemo_at_SERVER-5_0
>>>>> [SERVER-5:16404] tmp: /tmp
>>>>> [sv-3:08575] procdir: /tmp/openmpi-sessions-mpidemo_at_sv-3_0/62216/0/9
>>>>> [sv-3:08575] jobdir: /tmp/openmpi-sessions-mpidemo_at_sv-3_0/62216/0
>>>>> [sv-3:08575] top: openmpi-sessions-mpidemo_at_sv-3_0
>>>>> [sv-3:08575] tmp: /tmp
>>>>> [SERVER-14:10755] procdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-14_0/62216/0/6
>>>>> [SERVER-14:10755] jobdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-14_0/62216/0
>>>>> [SERVER-14:10755] top: openmpi-sessions-mpidemo_at_SERVER-14_0
>>>>> [SERVER-14:10755] tmp: /tmp
>>>>> [sv-4:12040] procdir: /tmp/openmpi-sessions-mpidemo_at_sv-4_0/62216/0/10
>>>>> [sv-4:12040] jobdir: /tmp/openmpi-sessions-mpidemo_at_sv-4_0/62216/0
>>>>> [sv-4:12040] top: openmpi-sessions-mpidemo_at_sv-4_0
>>>>> [sv-4:12040] tmp: /tmp
>>>>> [sv-2:07725] procdir: /tmp/openmpi-sessions-mpidemo_at_sv-2_0/62216/0/7
>>>>> [sv-2:07725] jobdir: /tmp/openmpi-sessions-mpidemo_at_sv-2_0/62216/0
>>>>> [sv-2:07725] top: openmpi-sessions-mpidemo_at_sv-2_0
>>>>> [sv-2:07725] tmp: /tmp
>>>>>
>>>>> Mapper requested: NULL Last mapper: round_robin Mapping policy:
>>>>> BYNODE Ranking policy: NODE Binding policy: NONE[NODE] Cpu set:
>>>>> NULL PPR: NULL
>>>>> Num new daemons: 0 New daemon starting vpid INVALID
>>>>> Num nodes: 10
>>>>>
>>>>> Data for node: SERVER-2 Launch id: -1 State: 2
>>>>> Daemon: [[62216,0],0] Daemon launched: True
>>>>> Num slots: 15 Slots in use: 1 Oversubscribed: FALSE
>>>>> Num slots allocated: 15 Max slots: 15
>>>>> Username on node: NULL
>>>>> Num procs: 1 Next node_rank: 1
>>>>> Data for proc: [[62216,1],0]
>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 0
>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>> Locale: 0-15 Binding: NULL[0]
>>>>>
>>>>> Data for node: x.x.x.24 Launch id: -1 State: 0
>>>>> Daemon: [[62216,0],1] Daemon launched: False
>>>>> Num slots: 2 Slots in use: 1 Oversubscribed: FALSE
>>>>> Num slots allocated: 2 Max slots: 2
>>>>> Username on node: NULL
>>>>> Num procs: 1 Next node_rank: 1
>>>>> Data for proc: [[62216,1],1]
>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 1
>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>
>>>>> Data for node: x.x.x.26 Launch id: -1 State: 0
>>>>> Daemon: [[62216,0],2] Daemon launched: False
>>>>> Num slots: 14 Slots in use: 1 Oversubscribed: FALSE
>>>>> Num slots allocated: 14 Max slots: 14
>>>>> Username on node: NULL
>>>>> Num procs: 1 Next node_rank: 1
>>>>> Data for proc: [[62216,1],2]
>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 2
>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>
>>>>> Data for node: x.x.x.28 Launch id: -1 State: 0
>>>>> Daemon: [[62216,0],3] Daemon launched: False
>>>>> Num slots: 16 Slots in use: 1 Oversubscribed: FALSE
>>>>> Num slots allocated: 16 Max slots: 16
>>>>> Username on node: NULL
>>>>> Num procs: 1 Next node_rank: 1
>>>>> Data for proc: [[62216,1],3]
>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 3
>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>
>>>>> Data for node: x.x.x.29 Launch id: -1 State: 0
>>>>> Daemon: [[62216,0],4] Daemon launched: False
>>>>> Num slots: 14 Slots in use: 1 Oversubscribed: FALSE
>>>>> Num slots allocated: 14 Max slots: 14
>>>>> Username on node: NULL
>>>>> Num procs: 1 Next node_rank: 1
>>>>> Data for proc: [[62216,1],4]
>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 4
>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>
>>>>> Data for node: x.x.x.30 Launch id: -1 State: 0
>>>>> Daemon: [[62216,0],5] Daemon launched: False
>>>>> Num slots: 16 Slots in use: 1 Oversubscribed: FALSE
>>>>> Num slots allocated: 16 Max slots: 16
>>>>> Username on node: NULL
>>>>> Num procs: 1 Next node_rank: 1
>>>>> Data for proc: [[62216,1],5]
>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 5
>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>
>>>>> Data for node: x.x.x.41 Launch id: -1 State: 0
>>>>> Daemon: [[62216,0],6] Daemon launched: False
>>>>> Num slots: 46 Slots in use: 1 Oversubscribed: FALSE
>>>>> Num slots allocated: 46 Max slots: 46
>>>>> Username on node: NULL
>>>>> Num procs: 1 Next node_rank: 1
>>>>> Data for proc: [[62216,1],6]
>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 6
>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>
>>>>> Data for node: x.x.x.101 Launch id: -1 State: 0
>>>>> Daemon: [[62216,0],7] Daemon launched: False
>>>>> Num slots: 46 Slots in use: 1 Oversubscribed: FALSE
>>>>> Num slots allocated: 46 Max slots: 46
>>>>> Username on node: NULL
>>>>> Num procs: 1 Next node_rank: 1
>>>>> Data for proc: [[62216,1],7]
>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 7
>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>
>>>>> Data for node: x.x.x.100 Launch id: -1 State: 0
>>>>> Daemon: [[62216,0],8] Daemon launched: False
>>>>> Num slots: 46 Slots in use: 1 Oversubscribed: FALSE
>>>>> Num slots allocated: 46 Max slots: 46
>>>>> Username on node: NULL
>>>>> Num procs: 1 Next node_rank: 1
>>>>> Data for proc: [[62216,1],8]
>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 8
>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>
>>>>> Data for node: x.x.x.102 Launch id: -1 State: 0
>>>>> Daemon: [[62216,0],9] Daemon launched: False
>>>>> Num slots: 22 Slots in use: 1 Oversubscribed: FALSE
>>>>> Num slots allocated: 22 Max slots: 22
>>>>> Username on node: NULL
>>>>> Num procs: 1 Next node_rank: 1
>>>>> Data for proc: [[62216,1],9]
>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 9
>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>> Locale: 0-7 Binding: NULL[0]
>>>>> [sv-1:46111] procdir: /tmp/openmpi-sessions-mpidemo_at_sv-1_0/62216/1/8
>>>>> [sv-1:46111] jobdir: /tmp/openmpi-sessions-mpidemo_at_sv-1_0/62216/1
>>>>> [sv-1:46111] top: openmpi-sessions-mpidemo_at_sv-1_0
>>>>> [sv-1:46111] tmp: /tmp
>>>>> [SERVER-14:10768] procdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-14_0/62216/1/6
>>>>> [SERVER-14:10768] jobdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-14_0/62216/1
>>>>> [SERVER-14:10768] top: openmpi-sessions-mpidemo_at_SERVER-14_0
>>>>> [SERVER-14:10768] tmp: /tmp
>>>>> [SERVER-2:08912] procdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-2_0/62216/1/0
>>>>> [SERVER-2:08912] jobdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-2_0/62216/1
>>>>> [SERVER-2:08912] top: openmpi-sessions-mpidemo_at_SERVER-2_0
>>>>> [SERVER-2:08912] tmp: /tmp
>>>>> [SERVER-4:27460] procdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-4_0/62216/1/2
>>>>> [SERVER-4:27460] jobdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-4_0/62216/1
>>>>> [SERVER-4:27460] top: openmpi-sessions-mpidemo_at_SERVER-4_0
>>>>> [SERVER-4:27460] tmp: /tmp
>>>>> [SERVER-6:11608] procdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-6_0/62216/1/4
>>>>> [SERVER-6:11608] jobdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-6_0/62216/1
>>>>> [SERVER-6:11608] top: openmpi-sessions-mpidemo_at_SERVER-6_0
>>>>> [SERVER-6:11608] tmp: /tmp
>>>>> [SERVER-7:02620] procdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-7_0/62216/1/5
>>>>> [SERVER-7:02620] jobdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-7_0/62216/1
>>>>> [SERVER-7:02620] top: openmpi-sessions-mpidemo_at_SERVER-7_0
>>>>> [SERVER-7:02620] tmp: /tmp
>>>>> [sv-3:08586] procdir: /tmp/openmpi-sessions-mpidemo_at_sv-3_0/62216/1/9
>>>>> [sv-3:08586] jobdir: /tmp/openmpi-sessions-mpidemo_at_sv-3_0/62216/1
>>>>> [sv-3:08586] top: openmpi-sessions-mpidemo_at_sv-3_0
>>>>> [sv-3:08586] tmp: /tmp
>>>>> [sv-2:07736] procdir: /tmp/openmpi-sessions-mpidemo_at_sv-2_0/62216/1/7
>>>>> [sv-2:07736] jobdir: /tmp/openmpi-sessions-mpidemo_at_sv-2_0/62216/1
>>>>> [sv-2:07736] top: openmpi-sessions-mpidemo_at_sv-2_0
>>>>> [sv-2:07736] tmp: /tmp
>>>>> [SERVER-5:16418] procdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-5_0/62216/1/3
>>>>> [SERVER-5:16418] jobdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-5_0/62216/1
>>>>> [SERVER-5:16418] top: openmpi-sessions-mpidemo_at_SERVER-5_0
>>>>> [SERVER-5:16418] tmp: /tmp
>>>>> [SERVER-3:32533] procdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-3_0/62216/1/1
>>>>> [SERVER-3:32533] jobdir:
>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-3_0/62216/1
>>>>> [SERVER-3:32533] top: openmpi-sessions-mpidemo_at_SERVER-3_0
>>>>> [SERVER-3:32533] tmp: /tmp
>>>>> MPIR_being_debugged = 0
>>>>> MPIR_debug_state = 1
>>>>> MPIR_partial_attach_ok = 1
>>>>> MPIR_i_am_starter = 0
>>>>> MPIR_forward_output = 0
>>>>> MPIR_proctable_size = 10
>>>>> MPIR_proctable:
>>>>> (i, host, exe, pid) = (0, SERVER-2,
>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 8912)
>>>>> (i, host, exe, pid) = (1, x.x.x.24,
>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 32533)
>>>>> (i, host, exe, pid) = (2, x.x.x.26,
>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 27460)
>>>>> (i, host, exe, pid) = (3, x.x.x.28,
>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 16418)
>>>>> (i, host, exe, pid) = (4, x.x.x.29,
>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 11608)
>>>>> (i, host, exe, pid) = (5, x.x.x.30,
>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 2620)
>>>>> (i, host, exe, pid) = (6, x.x.x.41,
>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 10768)
>>>>> (i, host, exe, pid) = (7, x.x.x.101,
>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 7736)
>>>>> (i, host, exe, pid) = (8, x.x.x.100,
>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 46111)
>>>>> (i, host, exe, pid) = (9, x.x.x.102,
>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 8586)
>>>>> MPIR_executable_path: NULL
>>>>> MPIR_server_arguments: NULL
>>>>> --------------------------------------------------------------------------
>>>>> It looks like MPI_INIT failed for some reason; your parallel
>>>>> process is
>>>>> likely to abort. There are many reasons that a parallel process can
>>>>> fail during MPI_INIT; some of which are due to configuration or
>>>>> environment
>>>>> problems. This failure appears to be an internal failure; here's some
>>>>> additional information (which may only be relevant to an Open MPI
>>>>> developer):
>>>>>
>>>>> PML add procs failed
>>>>> --> Returned "Error" (-1) instead of "Success" (0)
>>>>> --------------------------------------------------------------------------
>>>>> [SERVER-2:8912] *** An error occurred in MPI_Init
>>>>> [SERVER-2:8912] *** reported by process
>>>>> [140393673392129,140389596004352]
>>>>> [SERVER-2:8912] *** on a NULL communicator
>>>>> [SERVER-2:8912] *** Unknown error
>>>>> [SERVER-2:8912] *** MPI_ERRORS_ARE_FATAL (processes in this
>>>>> communicator will now abort,
>>>>> [SERVER-2:8912] *** and potentially your MPI job)
>>>>> --------------------------------------------------------------------------
>>>>> An MPI process is aborting at a time when it cannot guarantee that all
>>>>> of its peer processes in the job will be killed properly. You should
>>>>> double check that everything has shut down cleanly.
>>>>>
>>>>> Reason: Before MPI_INIT completed
>>>>> Local host: SERVER-2
>>>>> PID: 8912
>>>>> --------------------------------------------------------------------------
>>>>> [sv-1][[62216,1],8][btl_openib_proc.c:157:mca_btl_openib_proc_create]
>>>>> [btl_openib_proc.c:157] ompi_modex_recv failed for peer [[62216,1],0]
>>>>> [sv-1][[62216,1],8][btl_tcp_proc.c:128:mca_btl_tcp_proc_create]
>>>>> mca_base_modex_recv: failed with return value=-13
>>>>> [sv-1][[62216,1],8][btl_tcp_proc.c:128:mca_btl_tcp_proc_create]
>>>>> mca_base_modex_recv: failed with return value=-13
>>>>> --------------------------------------------------------------------------
>>>>> At least one pair of MPI processes are unable to reach each other for
>>>>> MPI communications. This means that no Open MPI device has indicated
>>>>> that it can be used to communicate between these processes. This is
>>>>> an error; Open MPI requires that all MPI processes be able to reach
>>>>> each other. This error can sometimes be the result of forgetting to
>>>>> specify the "self" BTL.
>>>>>
>>>>> Process 1 ([[62216,1],8]) is on host: sv-1
>>>>> Process 2 ([[62216,1],0]) is on host: SERVER-2
>>>>> BTLs attempted: openib self sm tcp
>>>>>
>>>>> Your MPI job is now going to abort; sorry.
>>>>> --------------------------------------------------------------------------
>>>>> [sv-3][[62216,1],9][btl_openib_proc.c:157:mca_btl_openib_proc_create]
>>>>> [btl_openib_proc.c:157] ompi_modex_recv failed for peer [[62216,1],0]
>>>>> [sv-3][[62216,1],9][btl_tcp_proc.c:128:mca_btl_tcp_proc_create]
>>>>> mca_base_modex_recv: failed with return value=-13
>>>>> [sv-3][[62216,1],9][btl_tcp_proc.c:128:mca_btl_tcp_proc_create]
>>>>> mca_base_modex_recv: failed with return value=-13
>>>>> --------------------------------------------------------------------------
>>>>> MPI_INIT has failed because at least one MPI process is unreachable
>>>>> from another. This *usually* means that an underlying communication
>>>>> plugin -- such as a BTL or an MTL -- has either not loaded or not
>>>>> allowed itself to be used. Your MPI job will now abort.
>>>>>
>>>>> You may wish to try to narrow down the problem;
>>>>>
>>>>> * Check the output of ompi_info to see which BTL/MTL plugins are
>>>>> available.
>>>>> * Run your application with MPI_THREAD_SINGLE.
>>>>> * Set the MCA parameter btl_base_verbose to 100 (or mtl_base_verbose,
>>>>> if using MTL-based communications) to see exactly which
>>>>> communication plugins were considered and/or discarded.
>>>>> --------------------------------------------------------------------------
>>>>> [sv-2][[62216,1],7][btl_openib_proc.c:157:mca_btl_openib_proc_create]
>>>>> [btl_openib_proc.c:157] ompi_modex_recv failed for peer [[62216,1],0]
>>>>> [sv-2][[62216,1],7][btl_tcp_proc.c:128:mca_btl_tcp_proc_create]
>>>>> mca_base_modex_recv: failed with return value=-13
>>>>> [sv-2][[62216,1],7][btl_tcp_proc.c:128:mca_btl_tcp_proc_create]
>>>>> mca_base_modex_recv: failed with return value=-13
>>>>> [SERVER-2:08907] sess_dir_finalize: proc session dir not empty -
>>>>> leaving
>>>>> [sv-4:12040] sess_dir_finalize: job session dir not empty - leaving
>>>>> [SERVER-14:10755] sess_dir_finalize: job session dir not empty -
>>>>> leaving
>>>>> [SERVER-2:08907] sess_dir_finalize: proc session dir not empty -
>>>>> leaving
>>>>> [SERVER-6:11595] sess_dir_finalize: proc session dir not empty -
>>>>> leaving
>>>>> [SERVER-6:11595] sess_dir_finalize: proc session dir not empty -
>>>>> leaving
>>>>> [SERVER-4:27445] sess_dir_finalize: proc session dir not empty -
>>>>> leaving
>>>>> exiting with status 0
>>>>> [SERVER-4:27445] sess_dir_finalize: proc session dir not empty -
>>>>> leaving
>>>>> [SERVER-6:11595] sess_dir_finalize: job session dir not empty -
>>>>> leaving
>>>>> [SERVER-7:02607] sess_dir_finalize: proc session dir not empty -
>>>>> leaving
>>>>> [SERVER-7:02607] sess_dir_finalize: proc session dir not empty -
>>>>> leaving
>>>>> [SERVER-7:02607] sess_dir_finalize: job session dir not empty -
>>>>> leaving
>>>>> [SERVER-5:16404] sess_dir_finalize: proc session dir not empty -
>>>>> leaving
>>>>> [SERVER-5:16404] sess_dir_finalize: proc session dir not empty -
>>>>> leaving
>>>>> exiting with status 0
>>>>> exiting with status 0
>>>>> exiting with status 0
>>>>> [SERVER-4:27445] sess_dir_finalize: job session dir not empty -
>>>>> leaving
>>>>> exiting with status 0
>>>>> [SERVER-3:32517] sess_dir_finalize: proc session dir not empty -
>>>>> leaving
>>>>> [SERVER-3:32517] sess_dir_finalize: proc session dir not empty -
>>>>> leaving
>>>>> [sv-3:08575] sess_dir_finalize: proc session dir not empty - leaving
>>>>> [sv-3:08575] sess_dir_finalize: job session dir not empty - leaving
>>>>> exiting with status 0
>>>>> [sv-1:46100] sess_dir_finalize: proc session dir not empty - leaving
>>>>> [sv-1:46100] sess_dir_finalize: job session dir not empty - leaving
>>>>> exiting with status 0
>>>>> [sv-2:07725] sess_dir_finalize: proc session dir not empty - leaving
>>>>> [sv-2:07725] sess_dir_finalize: job session dir not empty - leaving
>>>>> exiting with status 0
>>>>> [SERVER-5:16404] sess_dir_finalize: job session dir not empty -
>>>>> leaving
>>>>> exiting with status 0
>>>>> [SERVER-3:32517] sess_dir_finalize: job session dir not empty -
>>>>> leaving
>>>>> exiting with status 0
>>>>> --------------------------------------------------------------------------
>>>>> mpirun has exited due to process rank 6 with PID 10768 on
>>>>> node x.x.x.41 exiting improperly. There are three reasons this
>>>>> could occur:
>>>>>
>>>>> 1. this process did not call "init" before exiting, but others in
>>>>> the job did. This can cause a job to hang indefinitely while it waits
>>>>> for all processes to call "init". By rule, if one process calls
>>>>> "init",
>>>>> then ALL processes must call "init" prior to termination.
>>>>>
>>>>> 2. this process called "init", but exited without calling "finalize".
>>>>> By rule, all processes that call "init" MUST call "finalize" prior to
>>>>> exiting or it will be considered an "abnormal termination"
>>>>>
>>>>> 3. this process called "MPI_Abort" or "orte_abort" and the mca
>>>>> parameter
>>>>> orte_create_session_dirs is set to false. In this case, the
>>>>> run-time cannot
>>>>> detect that the abort call was an abnormal termination. Hence, the
>>>>> only
>>>>> error message you will receive is this one.
>>>>>
>>>>> This may have caused other processes in the application to be
>>>>> terminated by signals sent by mpirun (as reported here).
>>>>>
>>>>> You can avoid this message by specifying -quiet on the mpirun
>>>>> command line.
>>>>>
>>>>> --------------------------------------------------------------------------
>>>>> [SERVER-2:08907] 6 more processes have sent help message
>>>>> help-mpi-runtime / mpi_init:startup:internal-failure
>>>>> [SERVER-2:08907] Set MCA parameter "orte_base_help_aggregate" to 0
>>>>> to see all help / error messages
>>>>> [SERVER-2:08907] 9 more processes have sent help message
>>>>> help-mpi-errors.txt / mpi_errors_are_fatal unknown handle
>>>>> [SERVER-2:08907] 9 more processes have sent help message
>>>>> help-mpi-runtime.txt / ompi mpi abort:cannot guarantee all killed
>>>>> [SERVER-2:08907] 2 more processes have sent help message
>>>>> help-mca-bml-r2.txt / unreachable proc
>>>>> [SERVER-2:08907] 2 more processes have sent help message
>>>>> help-mpi-runtime / mpi_init:startup:pml-add-procs-fail
>>>>> [SERVER-2:08907] sess_dir_finalize: job session dir not empty -
>>>>> leaving
>>>>> exiting with status 1
>>>>>
>>>>> //******************************************************************
>>>>>
>>>>> On 8/3/13 4:34 AM, Ralph Castain wrote:
>>>>>> It looks like SERVER-2 cannot talk to your x.x.x.100 machine. I
>>>>>> note that you have some entries at the end of the hostfile that I
>>>>>> don't understand - a list of hosts that can be reached? And I see
>>>>>> that your x.x.x.22 machine isn't on it. Is that SERVER-2 by chance?
>>>>>>
>>>>>> Our hostfile parsing changed between the release series, but I
>>>>>> know we never consciously supported the syntax you show below
>>>>>> where you list capabilities, and then re-list the hosts in an
>>>>>> apparent attempt to filter which ones can actually be used. It is
>>>>>> possible that the 1.5 series somehow used that to exclude the 22
>>>>>> machine, and that the 1.7 parser now doesn't do that.
>>>>>>
>>>>>> If you only include machines you actually intend to use in your
>>>>>> hostfile, does the 1.7 series work?
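>>>>>>
>>>>>> That is, trim it down to just the slots lines -- something like
>>>>>> this (a sketch based on your file above):
>>>>>>
>>>>>> x.x.x.22 slots=15 max-slots=15
>>>>>> x.x.x.24 slots=2 max-slots=2
>>>>>> ...
>>>>>> x.x.x.103 slots=22 max-slots=22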
>>>>>>
>>>>>> On Aug 3, 2013, at 3:58 AM, RoboBeans <robobeans_at_[hidden]> wrote:
>>>>>>
>>>>>>> Hello everyone,
>>>>>>>
>>>>>>> I have installed openmpi 1.5.4 on an 11-node cluster using "yum
>>>>>>> install openmpi openmpi-devel" and everything seems to be
>>>>>>> working fine. For testing, I am using this test program:
>>>>>>>
>>>>>>> //******************************************************************
>>>>>>>
>>>>>>> *$ cat test.cpp*
>>>>>>>
>>>>>>> #include <stdio.h>
>>>>>>> #include <mpi.h>
>>>>>>>
>>>>>>> int main (int argc, char *argv[])
>>>>>>> {
>>>>>>>     int id, np;
>>>>>>>     char name[MPI_MAX_PROCESSOR_NAME];
>>>>>>>     int namelen;
>>>>>>>
>>>>>>>     MPI_Init (&argc, &argv);
>>>>>>>
>>>>>>>     /* rank, world size, and host name of this process */
>>>>>>>     MPI_Comm_size (MPI_COMM_WORLD, &np);
>>>>>>>     MPI_Comm_rank (MPI_COMM_WORLD, &id);
>>>>>>>     MPI_Get_processor_name (name, &namelen);
>>>>>>>
>>>>>>>     printf ("This is Process %2d out of %2d running on host %s\n",
>>>>>>>             id, np, name);
>>>>>>>
>>>>>>>     MPI_Finalize ();
>>>>>>>
>>>>>>>     return (0);
>>>>>>> }
>>>>>>>
>>>>>>> //******************************************************************
>>>>>>>
>>>>>>> and my hosts file looks like this:
>>>>>>>
>>>>>>> *$ cat mpi_hostfile*
>>>>>>>
>>>>>>> # The Hostfile for Open MPI
>>>>>>>
>>>>>>> # specify number of slots for processes to run locally.
>>>>>>> #localhost slots=12
>>>>>>> #x.x.x.16 slots=12 max-slots=12
>>>>>>> #x.x.x.17 slots=12 max-slots=12
>>>>>>> #x.x.x.18 slots=12 max-slots=12
>>>>>>> #x.x.1x.19 slots=12 max-slots=12
>>>>>>> #x.x.x.20 slots=12 max-slots=12
>>>>>>> #x.x.x.55 slots=46 max-slots=46
>>>>>>> #x.x.x.56 slots=46 max-slots=46
>>>>>>>
>>>>>>> x.x.x.22 slots=15 max-slots=15
>>>>>>> x.x.x.24 slots=2 max-slots=2
>>>>>>> x.x.x.26 slots=14 max-slots=14
>>>>>>> x.x.x.28 slots=16 max-slots=16
>>>>>>> x.x.x.29 slots=14 max-slots=14
>>>>>>> x.x.x.30 slots=16 max-slots=16
>>>>>>> x.x.x.41 slots=46 max-slots=46
>>>>>>> x.x.x.101 slots=46 max-slots=46
>>>>>>> x.x.x.100 slots=46 max-slots=46
>>>>>>> x.x.x.102 slots=22 max-slots=22
>>>>>>> x.x.x.103 slots=22 max-slots=22
>>>>>>>
>>>>>>> # The following slave nodes are available to this machine:
>>>>>>> x.x.x.24
>>>>>>> x.x.x.26
>>>>>>> x.x.x.28
>>>>>>> x.x.x.29
>>>>>>> x.x.x.30
>>>>>>> x.x.x.41
>>>>>>> x.x.x.101
>>>>>>> x.x.x.100
>>>>>>> x.x.x.102
>>>>>>> x.x.x.103
>>>>>>>
>>>>>>> //******************************************************************
>>>>>>>
>>>>>>> this is what my .bashrc looks like on each node:
>>>>>>>
>>>>>>> *$ cat ~/.bashrc*
>>>>>>>
>>>>>>> # .bashrc
>>>>>>>
>>>>>>> # Source global definitions
>>>>>>> if [ -f /etc/bashrc ]; then
>>>>>>> . /etc/bashrc
>>>>>>> fi
>>>>>>>
>>>>>>> # User specific aliases and functions
>>>>>>> umask 077
>>>>>>>
>>>>>>> export PSM_SHAREDCONTEXTS_MAX=20
>>>>>>>
>>>>>>> #export PATH=/usr/lib64/openmpi/bin${PATH:+:$PATH}
>>>>>>> export PATH=/usr/OPENMPI/openmpi-1.7.2/bin${PATH:+:$PATH}
>>>>>>>
>>>>>>> #export
>>>>>>> LD_LIBRARY_PATH=/usr/lib64/openmpi/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
>>>>>>> export
>>>>>>> LD_LIBRARY_PATH=/usr/OPENMPI/openmpi-1.7.2/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
>>>>>>>
>>>>>>> export PATH="$PATH":/bin/:/usr/lib/:/usr/lib:/usr:/usr/
>>>>>>>
>>>>>>> //******************************************************************
>>>>>>>
>>>>>>> *$ mpic++ test.cpp -o test*
>>>>>>>
>>>>>>> *$ mpirun -d --display-map -np 10 --hostfile mpi_hostfile --bynode ./test*
>>>>>>>
>>>>>>> //******************************************************************
>>>>>>>
>>>>>>> These nodes are running the 2.6.32-358.2.1.el6.x86_64 release:
>>>>>>>
>>>>>>> *$ uname*
>>>>>>> Linux
>>>>>>> *$ uname -r*
>>>>>>> 2.6.32-358.2.1.el6.x86_64
>>>>>>> *$ cat /etc/issue*
>>>>>>> CentOS release 6.4 (Final)
>>>>>>> Kernel \r on an \m
>>>>>>>
>>>>>>> //******************************************************************
>>>>>>>
>>>>>>> Now, if I install openmpi 1.7.2 on each node separately, I can
>>>>>>> only use it on either the first 7 nodes or the last 4 nodes, but
>>>>>>> not on all of them.
>>>>>>>
>>>>>>> //******************************************************************
>>>>>>>
>>>>>>> *$ gunzip -c openmpi-1.7.2.tar.gz | tar xf -*
>>>>>>>
>>>>>>> *$ cd openmpi-1.7.2*
>>>>>>>
>>>>>>> *$ ./configure --prefix=/usr/OPENMPI/openmpi-1.7.2
>>>>>>> --enable-event-thread-support --enable-opal-multi-threads
>>>>>>> --enable-orte-progress-threads --enable-mpi-thread-multiple*
>>>>>>>
>>>>>>> *$ make all install*
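>>>>>>>
>>>>>>> To confirm every node picks up the same 1.7.2 build, a quick check
>>>>>>> like this should work (assuming the same prefix on each node):
>>>>>>>
>>>>>>> *$ /usr/OPENMPI/openmpi-1.7.2/bin/ompi_info | head -n 3*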
>>>>>>>
>>>>>>> //******************************************************************
>>>>>>>
>>>>>>> This is the error message that I am receiving:
>>>>>>>
>>>>>>>
>>>>>>> *$ mpirun -d --display-map -np 10 --hostfile mpi_hostfile --bynode ./test*
>>>>>>>
>>>>>>> [SERVER-2:05284] procdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-2_0/50535/0/0
>>>>>>> [SERVER-2:05284] jobdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-2_0/50535/0
>>>>>>> [SERVER-2:05284] top: openmpi-sessions-mpidemo_at_SERVER-2_0
>>>>>>> [SERVER-2:05284] tmp: /tmp
>>>>>>> CentOS release 6.4 (Final)
>>>>>>> Kernel \r on an \m
>>>>>>> CentOS release 6.4 (Final)
>>>>>>> Kernel \r on an \m
>>>>>>> CentOS release 6.4 (Final)
>>>>>>> Kernel \r on an \m
>>>>>>> [SERVER-3:28993] procdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-3_0/50535/0/1
>>>>>>> [SERVER-3:28993] jobdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-3_0/50535/0
>>>>>>> [SERVER-3:28993] top: openmpi-sessions-mpidemo_at_SERVER-3_0
>>>>>>> [SERVER-3:28993] tmp: /tmp
>>>>>>> CentOS release 6.4 (Final)
>>>>>>> Kernel \r on an \m
>>>>>>> CentOS release 6.4 (Final)
>>>>>>> Kernel \r on an \m
>>>>>>> [SERVER-6:09087] procdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-6_0/50535/0/4
>>>>>>> [SERVER-6:09087] jobdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-6_0/50535/0
>>>>>>> [SERVER-6:09087] top: openmpi-sessions-mpidemo_at_SERVER-6_0
>>>>>>> [SERVER-6:09087] tmp: /tmp
>>>>>>> [SERVER-7:32563] procdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-7_0/50535/0/5
>>>>>>> [SERVER-7:32563] jobdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-7_0/50535/0
>>>>>>> [SERVER-7:32563] top: openmpi-sessions-mpidemo_at_SERVER-7_0
>>>>>>> [SERVER-7:32563] tmp: /tmp
>>>>>>> [SERVER-4:15711] procdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-4_0/50535/0/2
>>>>>>> [SERVER-4:15711] jobdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-4_0/50535/0
>>>>>>> [SERVER-4:15711] top: openmpi-sessions-mpidemo_at_SERVER-4_0
>>>>>>> [SERVER-4:15711] tmp: /tmp
>>>>>>> [sv-1:45701] procdir: /tmp/openmpi-sessions-mpidemo_at_sv-1_0/50535/0/8
>>>>>>> [sv-1:45701] jobdir: /tmp/openmpi-sessions-mpidemo_at_sv-1_0/50535/0
>>>>>>> [sv-1:45701] top: openmpi-sessions-mpidemo_at_sv-1_0
>>>>>>> [sv-1:45701] tmp: /tmp
>>>>>>> CentOS release 6.4 (Final)
>>>>>>> Kernel \r on an \m
>>>>>>> [sv-3:08352] procdir: /tmp/openmpi-sessions-mpidemo_at_sv-3_0/50535/0/9
>>>>>>> [sv-3:08352] jobdir: /tmp/openmpi-sessions-mpidemo_at_sv-3_0/50535/0
>>>>>>> [sv-3:08352] top: openmpi-sessions-mpidemo_at_sv-3_0
>>>>>>> [sv-3:08352] tmp: /tmp
>>>>>>> [SERVER-5:12534] procdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-5_0/50535/0/3
>>>>>>> [SERVER-5:12534] jobdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-5_0/50535/0
>>>>>>> [SERVER-5:12534] top: openmpi-sessions-mpidemo_at_SERVER-5_0
>>>>>>> [SERVER-5:12534] tmp: /tmp
>>>>>>> [SERVER-14:08399] procdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-14_0/50535/0/6
>>>>>>> [SERVER-14:08399] jobdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-14_0/50535/0
>>>>>>> [SERVER-14:08399] top: openmpi-sessions-mpidemo_at_SERVER-14_0
>>>>>>> [SERVER-14:08399] tmp: /tmp
>>>>>>> [sv-4:11802] procdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_sv-4_0/50535/0/10
>>>>>>> [sv-4:11802] jobdir: /tmp/openmpi-sessions-mpidemo_at_sv-4_0/50535/0
>>>>>>> [sv-4:11802] top: openmpi-sessions-mpidemo_at_sv-4_0
>>>>>>> [sv-4:11802] tmp: /tmp
>>>>>>> [sv-2:07503] procdir: /tmp/openmpi-sessions-mpidemo_at_sv-2_0/50535/0/7
>>>>>>> [sv-2:07503] jobdir: /tmp/openmpi-sessions-mpidemo_at_sv-2_0/50535/0
>>>>>>> [sv-2:07503] top: openmpi-sessions-mpidemo_at_sv-2_0
>>>>>>> [sv-2:07503] tmp: /tmp
>>>>>>>
>>>>>>> Mapper requested: NULL Last mapper: round_robin Mapping
>>>>>>> policy: BYNODE Ranking policy: NODE Binding policy: NONE[NODE]
>>>>>>> Cpu set: NULL PPR: NULL
>>>>>>> Num new daemons: 0 New daemon starting vpid INVALID
>>>>>>> Num nodes: 10
>>>>>>>
>>>>>>> Data for node: SERVER-2 Launch id: -1 State: 2
>>>>>>> Daemon: [[50535,0],0] Daemon launched: True
>>>>>>> Num slots: 15 Slots in use: 1 Oversubscribed: FALSE
>>>>>>> Num slots allocated: 15 Max slots: 15
>>>>>>> Username on node: NULL
>>>>>>> Num procs: 1 Next node_rank: 1
>>>>>>> Data for proc: [[50535,1],0]
>>>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 0
>>>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>>>> Locale: 0-15 Binding: NULL[0]
>>>>>>>
>>>>>>> Data for node: x.x.x.24 Launch id: -1 State: 0
>>>>>>> Daemon: [[50535,0],1] Daemon launched: False
>>>>>>> Num slots: 3 Slots in use: 1 Oversubscribed: FALSE
>>>>>>> Num slots allocated: 3 Max slots: 2
>>>>>>> Username on node: NULL
>>>>>>> Num procs: 1 Next node_rank: 1
>>>>>>> Data for proc: [[50535,1],1]
>>>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 1
>>>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>>>
>>>>>>> Data for node: x.x.x.26 Launch id: -1 State: 0
>>>>>>> Daemon: [[50535,0],2] Daemon launched: False
>>>>>>> Num slots: 15 Slots in use: 1 Oversubscribed: FALSE
>>>>>>> Num slots allocated: 15 Max slots: 14
>>>>>>> Username on node: NULL
>>>>>>> Num procs: 1 Next node_rank: 1
>>>>>>> Data for proc: [[50535,1],2]
>>>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 2
>>>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>>>
>>>>>>> Data for node: x.x.x.28 Launch id: -1 State: 0
>>>>>>> Daemon: [[50535,0],3] Daemon launched: False
>>>>>>> Num slots: 17 Slots in use: 1 Oversubscribed: FALSE
>>>>>>> Num slots allocated: 17 Max slots: 16
>>>>>>> Username on node: NULL
>>>>>>> Num procs: 1 Next node_rank: 1
>>>>>>> Data for proc: [[50535,1],3]
>>>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 3
>>>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>>>
>>>>>>> Data for node: x.x.x.29 Launch id: -1 State: 0
>>>>>>> Daemon: [[50535,0],4] Daemon launched: False
>>>>>>> Num slots: 15 Slots in use: 1 Oversubscribed: FALSE
>>>>>>> Num slots allocated: 15 Max slots: 14
>>>>>>> Username on node: NULL
>>>>>>> Num procs: 1 Next node_rank: 1
>>>>>>> Data for proc: [[50535,1],4]
>>>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 4
>>>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>>>
>>>>>>> Data for node: x.x.x.30 Launch id: -1 State: 0
>>>>>>> Daemon: [[50535,0],5] Daemon launched: False
>>>>>>> Num slots: 17 Slots in use: 1 Oversubscribed: FALSE
>>>>>>> Num slots allocated: 17 Max slots: 16
>>>>>>> Username on node: NULL
>>>>>>> Num procs: 1 Next node_rank: 1
>>>>>>> Data for proc: [[50535,1],5]
>>>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 5
>>>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>>>
>>>>>>> Data for node: x.x.x.41 Launch id: -1 State: 0
>>>>>>> Daemon: [[50535,0],6] Daemon launched: False
>>>>>>> Num slots: 47 Slots in use: 1 Oversubscribed: FALSE
>>>>>>> Num slots allocated: 47 Max slots: 46
>>>>>>> Username on node: NULL
>>>>>>> Num procs: 1 Next node_rank: 1
>>>>>>> Data for proc: [[50535,1],6]
>>>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 6
>>>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>>>
>>>>>>> Data for node: x.x.x.101 Launch id: -1 State: 0
>>>>>>> Daemon: [[50535,0],7] Daemon launched: False
>>>>>>> Num slots: 47 Slots in use: 1 Oversubscribed: FALSE
>>>>>>> Num slots allocated: 47 Max slots: 46
>>>>>>> Username on node: NULL
>>>>>>> Num procs: 1 Next node_rank: 1
>>>>>>> Data for proc: [[50535,1],7]
>>>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 7
>>>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>>>
>>>>>>> Data for node: x.x.x.100 Launch id: -1 State: 0
>>>>>>> Daemon: [[50535,0],8] Daemon launched: False
>>>>>>> Num slots: 47 Slots in use: 1 Oversubscribed: FALSE
>>>>>>> Num slots allocated: 47 Max slots: 46
>>>>>>> Username on node: NULL
>>>>>>> Num procs: 1 Next node_rank: 1
>>>>>>> Data for proc: [[50535,1],8]
>>>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 8
>>>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>>>
>>>>>>> Data for node: x.x.x.102 Launch id: -1 State: 0
>>>>>>> Daemon: [[50535,0],9] Daemon launched: False
>>>>>>> Num slots: 23 Slots in use: 1 Oversubscribed: FALSE
>>>>>>> Num slots allocated: 23 Max slots: 22
>>>>>>> Username on node: NULL
>>>>>>> Num procs: 1 Next node_rank: 1
>>>>>>> Data for proc: [[50535,1],9]
>>>>>>> Pid: 0 Local rank: 0 Node rank: 0 App rank: 9
>>>>>>> State: INITIALIZED Restarts: 0 App_context: 0
>>>>>>> Locale: 0-7 Binding: NULL[0]
>>>>>>> [sv-1:45712] procdir: /tmp/openmpi-sessions-mpidemo_at_sv-1_0/50535/1/8
>>>>>>> [sv-1:45712] jobdir: /tmp/openmpi-sessions-mpidemo_at_sv-1_0/50535/1
>>>>>>> [sv-1:45712] top: openmpi-sessions-mpidemo_at_sv-1_0
>>>>>>> [sv-1:45712] tmp: /tmp
>>>>>>> [SERVER-14:08412] procdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-14_0/50535/1/6
>>>>>>> [SERVER-14:08412] jobdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-14_0/50535/1
>>>>>>> [SERVER-14:08412] top: openmpi-sessions-mpidemo_at_SERVER-14_0
>>>>>>> [SERVER-14:08412] tmp: /tmp
>>>>>>> [SERVER-2:05291] procdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-2_0/50535/1/0
>>>>>>> [SERVER-2:05291] jobdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-2_0/50535/1
>>>>>>> [SERVER-2:05291] top: openmpi-sessions-mpidemo_at_SERVER-2_0
>>>>>>> [SERVER-2:05291] tmp: /tmp
>>>>>>> [SERVER-4:15726] procdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-4_0/50535/1/2
>>>>>>> [SERVER-4:15726] jobdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-4_0/50535/1
>>>>>>> [SERVER-4:15726] top: openmpi-sessions-mpidemo_at_SERVER-4_0
>>>>>>> [SERVER-4:15726] tmp: /tmp
>>>>>>> [SERVER-6:09100] procdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-6_0/50535/1/4
>>>>>>> [SERVER-6:09100] jobdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-6_0/50535/1
>>>>>>> [SERVER-6:09100] top: openmpi-sessions-mpidemo_at_SERVER-6_0
>>>>>>> [SERVER-6:09100] tmp: /tmp
>>>>>>> [SERVER-7:32576] procdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-7_0/50535/1/5
>>>>>>> [SERVER-7:32576] jobdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-7_0/50535/1
>>>>>>> [SERVER-7:32576] top: openmpi-sessions-mpidemo_at_SERVER-7_0
>>>>>>> [SERVER-7:32576] tmp: /tmp
>>>>>>> [sv-3:08363] procdir: /tmp/openmpi-sessions-mpidemo_at_sv-3_0/50535/1/9
>>>>>>> [sv-3:08363] jobdir: /tmp/openmpi-sessions-mpidemo_at_sv-3_0/50535/1
>>>>>>> [sv-3:08363] top: openmpi-sessions-mpidemo_at_sv-3_0
>>>>>>> [sv-3:08363] tmp: /tmp
>>>>>>> [sv-2:07514] procdir: /tmp/openmpi-sessions-mpidemo_at_sv-2_0/50535/1/7
>>>>>>> [sv-2:07514] jobdir: /tmp/openmpi-sessions-mpidemo_at_sv-2_0/50535/1
>>>>>>> [sv-2:07514] top: openmpi-sessions-mpidemo_at_sv-2_0
>>>>>>> [sv-2:07514] tmp: /tmp
>>>>>>> [SERVER-5:12548] procdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-5_0/50535/1/3
>>>>>>> [SERVER-5:12548] jobdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-5_0/50535/1
>>>>>>> [SERVER-5:12548] top: openmpi-sessions-mpidemo_at_SERVER-5_0
>>>>>>> [SERVER-5:12548] tmp: /tmp
>>>>>>> [SERVER-3:29009] procdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-3_0/50535/1/1
>>>>>>> [SERVER-3:29009] jobdir:
>>>>>>> /tmp/openmpi-sessions-mpidemo_at_SERVER-3_0/50535/1
>>>>>>> [SERVER-3:29009] top: openmpi-sessions-mpidemo_at_SERVER-3_0
>>>>>>> [SERVER-3:29009] tmp: /tmp
>>>>>>> MPIR_being_debugged = 0
>>>>>>> MPIR_debug_state = 1
>>>>>>> MPIR_partial_attach_ok = 1
>>>>>>> MPIR_i_am_starter = 0
>>>>>>> MPIR_forward_output = 0
>>>>>>> MPIR_proctable_size = 10
>>>>>>> MPIR_proctable:
>>>>>>> (i, host, exe, pid) = (0, SERVER-2,
>>>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 5291)
>>>>>>> (i, host, exe, pid) = (1, x.x.x.24,
>>>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 29009)
>>>>>>> (i, host, exe, pid) = (2, x.x.x.26,
>>>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 15726)
>>>>>>> (i, host, exe, pid) = (3, x.x.x.28,
>>>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 12548)
>>>>>>> (i, host, exe, pid) = (4, x.x.x.29,
>>>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 9100)
>>>>>>> (i, host, exe, pid) = (5, x.x.x.30,
>>>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 32576)
>>>>>>> (i, host, exe, pid) = (6, x.x.x.41,
>>>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 8412)
>>>>>>> (i, host, exe, pid) = (7, x.x.x.101,
>>>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 7514)
>>>>>>> (i, host, exe, pid) = (8, x.x.x.100,
>>>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 45712)
>>>>>>> (i, host, exe, pid) = (9, x.x.x.102,
>>>>>>> /usr2/mpidemo/dev/DISTRIBUTED_COMPUTING/./test, 8363)
>>>>>>> MPIR_executable_path: NULL
>>>>>>> MPIR_server_arguments: NULL
>>>>>>> --------------------------------------------------------------------------
>>>>>>> It looks like MPI_INIT failed for some reason; your parallel
>>>>>>> process is
>>>>>>> likely to abort. There are many reasons that a parallel process can
>>>>>>> fail during MPI_INIT; some of which are due to configuration or
>>>>>>> environment
>>>>>>> problems. This failure appears to be an internal failure;
>>>>>>> here's some
>>>>>>> additional information (which may only be relevant to an Open MPI
>>>>>>> developer):
>>>>>>>
>>>>>>> PML add procs failed
>>>>>>> --> Returned "Error" (-1) instead of "Success" (0)
>>>>>>> --------------------------------------------------------------------------
>>>>>>> [SERVER-2:5291] *** An error occurred in MPI_Init
>>>>>>> [SERVER-2:5291] *** reported by process
>>>>>>> [140508871983105,140505560121344]
>>>>>>> [SERVER-2:5291] *** on a NULL communicator
>>>>>>> [SERVER-2:5291] *** Unknown error
>>>>>>> [SERVER-2:5291] *** MPI_ERRORS_ARE_FATAL (processes in this
>>>>>>> communicator will now abort,
>>>>>>> [SERVER-2:5291] *** and potentially your MPI job)
>>>>>>> --------------------------------------------------------------------------
>>>>>>> An MPI process is aborting at a time when it cannot guarantee
>>>>>>> that all
>>>>>>> of its peer processes in the job will be killed properly. You
>>>>>>> should
>>>>>>> double check that everything has shut down cleanly.
>>>>>>>
>>>>>>> Reason: Before MPI_INIT completed
>>>>>>> Local host: SERVER-2
>>>>>>> PID: 5291
>>>>>>> --------------------------------------------------------------------------
>>>>>>> [sv-1][[50535,1],8][btl_openib_proc.c:157:mca_btl_openib_proc_create]
>>>>>>> [btl_openib_proc.c:157] ompi_modex_recv failed for peer
>>>>>>> [[50535,1],0]
>>>>>>> [sv-3][[50535,1],9][btl_openib_proc.c:157:mca_btl_openib_proc_create]
>>>>>>> [btl_openib_proc.c:157] ompi_modex_recv failed for peer
>>>>>>> [[50535,1],0]
>>>>>>> [sv-3][[50535,1],9][btl_tcp_proc.c:128:mca_btl_tcp_proc_create]
>>>>>>> mca_base_modex_recv: failed with return value=-13
>>>>>>> [sv-3][[50535,1],9][btl_tcp_proc.c:128:mca_btl_tcp_proc_create]
>>>>>>> mca_base_modex_recv: failed with return value=-13
>>>>>>> [sv-1][[50535,1],8][btl_tcp_proc.c:128:mca_btl_tcp_proc_create]
>>>>>>> mca_base_modex_recv: failed with return value=-13
>>>>>>> [sv-1][[50535,1],8][btl_tcp_proc.c:128:mca_btl_tcp_proc_create]
>>>>>>> mca_base_modex_recv: failed with return value=-13
>>>>>>> --------------------------------------------------------------------------
>>>>>>> At least one pair of MPI processes are unable to reach each
>>>>>>> other for
>>>>>>> MPI communications. This means that no Open MPI device has
>>>>>>> indicated
>>>>>>> that it can be used to communicate between these processes. This is
>>>>>>> an error; Open MPI requires that all MPI processes be able to reach
>>>>>>> each other. This error can sometimes be the result of forgetting to
>>>>>>> specify the "self" BTL.
>>>>>>>
>>>>>>> Process 1 ([[50535,1],8]) is on host: sv-1
>>>>>>> Process 2 ([[50535,1],0]) is on host: SERVER-2
>>>>>>> BTLs attempted: openib self sm tcp
>>>>>>>
>>>>>>> Your MPI job is now going to abort; sorry.
>>>>>>> --------------------------------------------------------------------------
>>>>>>> --------------------------------------------------------------------------
>>>>>>> MPI_INIT has failed because at least one MPI process is unreachable
>>>>>>> from another. This *usually* means that an underlying communication
>>>>>>> plugin -- such as a BTL or an MTL -- has either not loaded or not
>>>>>>> allowed itself to be used. Your MPI job will now abort.
>>>>>>>
>>>>>>> You may wish to try to narrow down the problem;
>>>>>>>
>>>>>>> * Check the output of ompi_info to see which BTL/MTL plugins are
>>>>>>> available.
>>>>>>> * Run your application with MPI_THREAD_SINGLE.
>>>>>>> * Set the MCA parameter btl_base_verbose to 100 (or
>>>>>>> mtl_base_verbose,
>>>>>>> if using MTL-based communications) to see exactly which
>>>>>>> communication plugins were considered and/or discarded.
>>>>>>> --------------------------------------------------------------------------
>>>>>>> [sv-2][[50535,1],7][btl_openib_proc.c:157:mca_btl_openib_proc_create]
>>>>>>> [btl_openib_proc.c:157] ompi_modex_recv failed for peer
>>>>>>> [[50535,1],0]
>>>>>>> [sv-2][[50535,1],7][btl_tcp_proc.c:128:mca_btl_tcp_proc_create]
>>>>>>> mca_base_modex_recv: failed with return value=-13
>>>>>>> [sv-2][[50535,1],7][btl_tcp_proc.c:128:mca_btl_tcp_proc_create]
>>>>>>> mca_base_modex_recv: failed with return value=-13
>>>>>>> [SERVER-2:05284] sess_dir_finalize: proc session dir not empty -
>>>>>>> leaving
>>>>>>> [SERVER-2:05284] sess_dir_finalize: proc session dir not empty -
>>>>>>> leaving
>>>>>>> [sv-4:11802] sess_dir_finalize: job session dir not empty - leaving
>>>>>>> [SERVER-14:08399] sess_dir_finalize: job session dir not empty -
>>>>>>> leaving
>>>>>>> [SERVER-6:09087] sess_dir_finalize: proc session dir not empty -
>>>>>>> leaving
>>>>>>> [SERVER-6:09087] sess_dir_finalize: proc session dir not empty -
>>>>>>> leaving
>>>>>>> [SERVER-4:15711] sess_dir_finalize: proc session dir not empty -
>>>>>>> leaving
>>>>>>> [SERVER-4:15711] sess_dir_finalize: proc session dir not empty -
>>>>>>> leaving
>>>>>>> [SERVER-6:09087] sess_dir_finalize: job session dir not empty -
>>>>>>> leaving
>>>>>>> exiting with status 0
>>>>>>> [SERVER-7:32563] sess_dir_finalize: proc session dir not empty -
>>>>>>> leaving
>>>>>>> [SERVER-7:32563] sess_dir_finalize: proc session dir not empty -
>>>>>>> leaving
>>>>>>> [SERVER-5:12534] sess_dir_finalize: proc session dir not empty -
>>>>>>> leaving
>>>>>>> [SERVER-5:12534] sess_dir_finalize: proc session dir not empty -
>>>>>>> leaving
>>>>>>> [SERVER-7:32563] sess_dir_finalize: job session dir not empty -
>>>>>>> leaving
>>>>>>> exiting with status 0
>>>>>>> exiting with status 0
>>>>>>> exiting with status 0
>>>>>>> [SERVER-4:15711] sess_dir_finalize: job session dir not empty -
>>>>>>> leaving
>>>>>>> [SERVER-3:28993] sess_dir_finalize: proc session dir not empty -
>>>>>>> leaving
>>>>>>> exiting with status 0
>>>>>>> [SERVER-3:28993] sess_dir_finalize: proc session dir not empty -
>>>>>>> leaving
>>>>>>> [sv-3:08352] sess_dir_finalize: proc session dir not empty - leaving
>>>>>>> [sv-3:08352] sess_dir_finalize: job session dir not empty - leaving
>>>>>>> [sv-1:45701] sess_dir_finalize: proc session dir not empty - leaving
>>>>>>> [sv-1:45701] sess_dir_finalize: job session dir not empty - leaving
>>>>>>> exiting with status 0
>>>>>>> exiting with status 0
>>>>>>> [sv-2:07503] sess_dir_finalize: proc session dir not empty - leaving
>>>>>>> [sv-2:07503] sess_dir_finalize: job session dir not empty - leaving
>>>>>>> exiting with status 0
>>>>>>> [SERVER-5:12534] sess_dir_finalize: job session dir not empty -
>>>>>>> leaving
>>>>>>> exiting with status 0
>>>>>>> [SERVER-3:28993] sess_dir_finalize: job session dir not empty -
>>>>>>> leaving
>>>>>>> exiting with status 0
>>>>>>> --------------------------------------------------------------------------
>>>>>>> mpirun has exited due to process rank 6 with PID 8412 on
>>>>>>> node x.x.x.41 exiting improperly. There are three reasons this
>>>>>>> could occur:
>>>>>>>
>>>>>>> 1. this process did not call "init" before exiting, but others in
>>>>>>> the job did. This can cause a job to hang indefinitely while it
>>>>>>> waits
>>>>>>> for all processes to call "init". By rule, if one process calls
>>>>>>> "init",
>>>>>>> then ALL processes must call "init" prior to termination.
>>>>>>>
>>>>>>> 2. this process called "init", but exited without calling
>>>>>>> "finalize".
>>>>>>> By rule, all processes that call "init" MUST call "finalize"
>>>>>>> prior to
>>>>>>> exiting or it will be considered an "abnormal termination"
>>>>>>>
>>>>>>> 3. this process called "MPI_Abort" or "orte_abort" and the mca
>>>>>>> parameter
>>>>>>> orte_create_session_dirs is set to false. In this case, the
>>>>>>> run-time cannot
>>>>>>> detect that the abort call was an abnormal termination. Hence,
>>>>>>> the only
>>>>>>> error message you will receive is this one.
>>>>>>>
>>>>>>> This may have caused other processes in the application to be
>>>>>>> terminated by signals sent by mpirun (as reported here).
>>>>>>>
>>>>>>> You can avoid this message by specifying -quiet on the mpirun
>>>>>>> command line.
>>>>>>>
>>>>>>> --------------------------------------------------------------------------
>>>>>>> [SERVER-2:05284] 6 more processes have sent help message
>>>>>>> help-mpi-runtime / mpi_init:startup:internal-failure
>>>>>>> [SERVER-2:05284] Set MCA parameter "orte_base_help_aggregate" to
>>>>>>> 0 to see all help / error messages
>>>>>>> [SERVER-2:05284] 9 more processes have sent help message
>>>>>>> help-mpi-errors.txt / mpi_errors_are_fatal unknown handle
>>>>>>> [SERVER-2:05284] 9 more processes have sent help message
>>>>>>> help-mpi-runtime.txt / ompi mpi abort:cannot guarantee all killed
>>>>>>> [SERVER-2:05284] 2 more processes have sent help message
>>>>>>> help-mca-bml-r2.txt / unreachable proc
>>>>>>> [SERVER-2:05284] 2 more processes have sent help message
>>>>>>> help-mpi-runtime / mpi_init:startup:pml-add-procs-fail
>>>>>>> [SERVER-2:05284] sess_dir_finalize: job session dir not empty -
>>>>>>> leaving
>>>>>>> exiting with status 1
>>>>>>>
>>>>>>> //******************************************************************
>>>>>>>
>>>>>>> Any feedback will be helpful. Thank you!
>>>>>>>
>>>>>>> Mr. Beans