Open MPI User's Mailing List Archives

Subject: [OMPI users] OpenMPI.1.3.2 : PML add procs failed error while running with -mca btl openib, self, sm
From: Kartik (kartik.thathagar_at_[hidden])
Date: 2009-07-20 05:42:31


Hi,

We are running Open MPI 1.3.2 with OFED 1.5. We have an 8-node cluster with
10Gb iWARP Ethernet cards.
 
The node names are n130, n131, n132, n133, n134, n135, n136 and n137.
The respective 10Gb hostnames are n130x, n131x, ..., n137x.
 
Our /root/mpd.hosts file contains the following entries:
 
n130x
n131x
n134x
n135x
n136x
n132x
n133x
n137x
 
We are not able to run Open MPI across all 8 nodes with the following command:
 
mpirun -n 8 -np 8 -hostfile /root/mpd.hosts -mca btl openib,self,sm
--mca orte_base_help_aggregate 0 --mca btl_base_verbose 10 --mca
btl_openib_verbose 100 /usr/mpi/gcc/openmpi-1.3.2/tests/IMB-3.1/IMB-MPI1
Barrier
 
Output:
=================================================================================
 
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.
 
  Process 1 ([[33322,1],0]) is on host: n130
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm
 
Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.
 
  Process 1 ([[33322,1],2]) is on host: n134
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm
 
Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.
 
  Process 1 ([[33322,1],5]) is on host: n132
  Process 2 ([[33322,1],0]) is on host: n130
  BTLs attempted: openib self sm
 
Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.
 
  Process 1 ([[33322,1],7]) is on host: n137
  Process 2 ([[33322,1],0]) is on host: n130
  BTLs attempted: openib self sm
 
Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.
 
  Process 1 ([[33322,1],3]) is on host: n135
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm
 
Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.
 
  Process 1 ([[33322,1],6]) is on host: n133
  Process 2 ([[33322,1],0]) is on host: n130
  BTLs attempted: openib self sm
 
Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.
 
  Process 1 ([[33322,1],1]) is on host: n131
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm
 
Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.
 
  Process 1 ([[33322,1],4]) is on host: n136
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm
 
Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
 
  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
 
  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
 
  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
 
  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
 
  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
 
  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
 
  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[n134:4888] Abort before MPI_INIT completed successfully; not able to
guarantee that all other processes were killed!
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[n137:4890] Abort before MPI_INIT completed successfully; not able to
guarantee that all other processes were killed!
[n135:4883] Abort before MPI_INIT completed successfully; not able to
guarantee that all other processes were killed!
[n133:4850] Abort before MPI_INIT completed successfully; not able to
guarantee that all other processes were killed!
[n136:4866] Abort before MPI_INIT completed successfully; not able to
guarantee that all other processes were killed!
[n131:4866] Abort before MPI_INIT completed successfully; not able to
guarantee that all other processes were killed!
[n132:4855] Abort before MPI_INIT completed successfully; not able to
guarantee that all other processes were killed!
--------------------------------------------------------------------------
mpirun has exited due to process rank 3 with PID 4883 on
node n135x exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
 
  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[n130:4885] Abort before MPI_INIT completed successfully; not able to
guarantee that all other processes were killed!
=================================================================================
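
The rank/host pairs reported above can be pulled out of saved mpirun output with a short script. This is only a minimal sketch: the function name unreachable_pairs and the sample text are illustrative assumptions, and the regular expression is based solely on the "Process N ([[jobid,1],rank]) is on host: ..." lines shown in this output.

```python
import re
from collections import Counter

# Matches lines such as:
#   Process 1 ([[33322,1],0]) is on host: n130
PROC_LINE = re.compile(
    r"Process ([12]) \(\[\[\d+,\d+\],(\d+)\]\) is on host: (\S+)"
)


def unreachable_pairs(text):
    """Count ((rank, host), (rank, host)) pairs reported as unreachable.

    Hypothetical helper: pairs each "Process 1" line with the next
    "Process 2" line that follows it in the text.
    """
    pairs = Counter()
    first = None
    for line in text.splitlines():
        m = PROC_LINE.search(line)
        if not m:
            continue
        which, rank, host = m.group(1), int(m.group(2)), m.group(3)
        if which == "1":
            first = (rank, host)
        elif first is not None:
            pairs[(first, (rank, host))] += 1
            first = None
    return pairs


# Two of the error blocks from the output above, trimmed to the relevant lines.
sample = """\
  Process 1 ([[33322,1],0]) is on host: n130
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm
  Process 1 ([[33322,1],5]) is on host: n132
  Process 2 ([[33322,1],0]) is on host: n130
  BTLs attempted: openib self sm
"""

for (src, dst), count in sorted(unreachable_pairs(sample).items()):
    print(f"rank {src[0]} on {src[1]} -> rank {dst[0]} on {dst[1]} ({count}x)")
```

Run over the full output, this would list each failing pair once per report, which makes it easier to see that every pair involves n132, n133 or n137 on one side.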
 
We are able to run the same command with the tcp BTL on all 8 nodes, as below:
 
mpirun -n 8 -np 8 -hostfile /root/mpd.hosts -mca btl tcp,self,sm --mca
orte_base_help_aggregate 0 --mca btl_base_verbose 10 --mca
btl_openib_verbose 100 /usr/mpi/gcc/openmpi-1.3.2/tests/IMB-3.1/IMB-MPI1
Barrier
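
The openib and tcp runs differ only in the BTL list, so both command lines can be generated from one place. The following is a minimal sketch, not part of our actual setup: the helper name imb_command is hypothetical, the paths and MCA parameters are copied from the commands in this mail, and the duplicate "-n 8" is dropped on the assumption that -n and -np are synonyms in mpirun.

```python
import shlex


def imb_command(
    btls,
    hostfile="/root/mpd.hosts",
    nprocs=8,
    binary="/usr/mpi/gcc/openmpi-1.3.2/tests/IMB-3.1/IMB-MPI1",
):
    """Build the mpirun argument list for the IMB Barrier test.

    Hypothetical helper; only the BTL list changes between runs.
    """
    return [
        "mpirun", "-np", str(nprocs),
        "-hostfile", hostfile,
        "-mca", "btl", ",".join(btls),
        "--mca", "orte_base_help_aggregate", "0",
        "--mca", "btl_base_verbose", "10",
        "--mca", "btl_openib_verbose", "100",
        binary, "Barrier",
    ]


# Print the failing (openib) and working (tcp) variants side by side.
for btls in (["openib", "self", "sm"], ["tcp", "self", "sm"]):
    print(shlex.join(imb_command(btls)))
```

The returned list can also be passed directly to subprocess.run if the runs are to be scripted.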
 
 
If we remove the n132, n133 and n137 nodes from the mpd.hosts file, we are able to
run Open MPI on the remaining 5 nodes with btl openib,sm,self.

So the problem seems to involve only the n132, n133 and n137 nodes. We are able to
run Open MPI on these 3 nodes by themselves, but when we run them together with the
other 5 nodes, or with any one of them (n130, n131, n134, n135, n136), we
get the error below:
 
Output:
===============
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.
 
  Process 1 ([[33304,1],1]) is on host: n132
  Process 2 ([[33304,1],0]) is on host: n130
  BTLs attempted: openib self sm
 
Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
 
  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.
 
  Process 1 ([[33304,1],0]) is on host: n130
  Process 2 ([[33304,1],1]) is on host: 100
  BTLs attempted: openib self sm
 
Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
 
  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[n130:4929] Abort before MPI_INIT completed successfully; not able to
guarantee that all other processes were killed!
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[n132:4963] Abort before MPI_INIT completed successfully; not able to
guarantee that all other processes were killed!
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 4929 on
node n130 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
-----------------------------------------------------------
 
We are able to run Intel MPI and MVAPICH2 on all 8 nodes, but we are facing
this problem with Open MPI. Can anyone help us find the real issue with these 3
nodes?

Please find the attached log for details.
 
 
Thanks,
Hardik

------------------------------------------------------------------------

[root@n130 scripts]# mpirun -n 8 -np 8 -hostfile /root/mpd.hosts -mca btl openib,self,sm --mca orte_base_help_aggregate 0 --mca btl_base_verbose 10 --mca btl_openib_verbose 100 /opt/openmpi-1.3.2/NetEffect/test_bin/IMB_3.2/IMB-MPI1 Barrier
[n130:04885] mca: base: components_open: Looking for btl components
[n130:04885] mca: base: components_open: opening btl components
[n130:04885] mca: base: components_open: found loaded component openib
[n130:04885] mca: base: components_open: component openib has no register function
[n130:04885] mca: base: components_open: component openib open function successful
[n130:04885] mca: base: components_open: found loaded component self
[n130:04885] mca: base: components_open: component self has no register function
[n130:04885] mca: base: components_open: component self open function successful
[n130:04885] mca: base: components_open: found loaded component sm
[n130:04885] mca: base: components_open: component sm has no register function
[n130:04885] mca: base: components_open: component sm open function successful
[n134:04888] mca: base: components_open: Looking for btl components
[n136:04866] mca: base: components_open: Looking for btl components
[n130:04885] select: initializing btl component openib
[n131:04866] mca: base: components_open: Looking for btl components
[n134:04888] mca: base: components_open: opening btl components
[n134:04888] mca: base: components_open: found loaded component openib
[n134:04888] mca: base: components_open: component openib has no register function
[n136:04866] mca: base: components_open: opening btl components
[n136:04866] mca: base: components_open: found loaded component openib
[n136:04866] mca: base: components_open: component openib has no register function
[n134:04888] mca: base: components_open: component openib open function successful
[n134:04888] mca: base: components_open: found loaded component self
[n134:04888] mca: base: components_open: component self has no register function
[n134:04888] mca: base: components_open: component self open function successful
[n134:04888] mca: base: components_open: found loaded component sm
[n134:04888] mca: base: components_open: component sm has no register function
[n134:04888] mca: base: components_open: component sm open function successful
[n136:04866] mca: base: components_open: component openib open function successful
[n136:04866] mca: base: components_open: found loaded component self
[n136:04866] mca: base: components_open: component self has no register function
[n136:04866] mca: base: components_open: component self open function successful
[n136:04866] mca: base: components_open: found loaded component sm
[n136:04866] mca: base: components_open: component sm has no register function
[n136:04866] mca: base: components_open: component sm open function successful
[n132:04855] mca: base: components_open: Looking for btl components
[n133:04850] mca: base: components_open: Looking for btl components
[n130][[33322,1],0][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x1255, part ID 256
[n130][[33322,1],0][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: NetEffect NE020
[n130][[33322,1],0][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[n130][[33322,1],0][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: default
[n130:04885] openib BTL: oob CPC only supported on InfiniBand; skipped on device nes0
[n130:04885] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device nes0
[n130:04885] openib BTL: rdmacm CPC available for use on nes0
[n130:04885] select: init of component openib returned success
[n130:04885] select: initializing btl component self
[n130:04885] select: init of component self returned success
[n130:04885] select: initializing btl component sm
[n130:04885] select: init of component sm returned success
[n135:04883] mca: base: components_open: Looking for btl components
[n131:04866] mca: base: components_open: opening btl components
[n131:04866] mca: base: components_open: found loaded component openib
[n131:04866] mca: base: components_open: component openib has no register function
[n131:04866] mca: base: components_open: component openib open function successful
[n131:04866] mca: base: components_open: found loaded component self
[n131:04866] mca: base: components_open: component self has no register function
[n131:04866] mca: base: components_open: component self open function successful
[n131:04866] mca: base: components_open: found loaded component sm
[n131:04866] mca: base: components_open: component sm has no register function
[n131:04866] mca: base: components_open: component sm open function successful
[n134:04888] select: initializing btl component openib
[n136:04866] select: initializing btl component openib
[n131:04866] select: initializing btl component openib
[n132:04855] mca: base: components_open: opening btl components
[n132:04855] mca: base: components_open: found loaded component openib
[n132:04855] mca: base: components_open: component openib has no register function
[n132:04855] mca: base: components_open: component openib open function successful
[n132:04855] mca: base: components_open: found loaded component self
[n132:04855] mca: base: components_open: component self has no register function
[n132:04855] mca: base: components_open: component self open function successful
[n132:04855] mca: base: components_open: found loaded component sm
[n132:04855] mca: base: components_open: component sm has no register function
[n132:04855] mca: base: components_open: component sm open function successful
[n133:04850] mca: base: components_open: opening btl components
[n133:04850] mca: base: components_open: found loaded component openib
[n133:04850] mca: base: components_open: component openib has no register function
[n133:04850] mca: base: components_open: component openib open function successful
[n133:04850] mca: base: components_open: found loaded component self
[n133:04850] mca: base: components_open: component self has no register function
[n133:04850] mca: base: components_open: component self open function successful
[n133:04850] mca: base: components_open: found loaded component sm
[n133:04850] mca: base: components_open: component sm has no register function
[n133:04850] mca: base: components_open: component sm open function successful
[n136][[33322,1],4][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x1255, part ID 256
[n136][[33322,1],4][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: NetEffect NE020
[n136][[33322,1],4][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[n136][[33322,1],4][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: default
[n136:04866] openib BTL: oob CPC only supported on InfiniBand; skipped on device nes0
[n136:04866] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device nes0
[n136:04866] openib BTL: rdmacm CPC available for use on nes0
[n136:04866] select: init of component openib returned success
[n136:04866] select: initializing btl component self
[n136:04866] select: init of component self returned success
[n136:04866] select: initializing btl component sm
[n136:04866] select: init of component sm returned success
[n135:04883] mca: base: components_open: opening btl components
[n135:04883] mca: base: components_open: found loaded component openib
[n135:04883] mca: base: components_open: component openib has no register function
[n135:04883] mca: base: components_open: component openib open function successful
[n135:04883] mca: base: components_open: found loaded component self
[n135:04883] mca: base: components_open: component self has no register function
[n135:04883] mca: base: components_open: component self open function successful
[n135:04883] mca: base: components_open: found loaded component sm
[n135:04883] mca: base: components_open: component sm has no register function
[n135:04883] mca: base: components_open: component sm open function successful
[n137:04890] mca: base: components_open: Looking for btl components
[n134][[33322,1],2][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x1255, part ID 256
[n134][[33322,1],2][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: NetEffect NE020
[n134][[33322,1],2][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[n134][[33322,1],2][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: default
[n134:04888] openib BTL: oob CPC only supported on InfiniBand; skipped on device nes0
[n134:04888] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device nes0
[n134:04888] openib BTL: rdmacm CPC available for use on nes0
[n134:04888] select: init of component openib returned success
[n134:04888] select: initializing btl component self
[n134:04888] select: init of component self returned success
[n134:04888] select: initializing btl component sm
[n134:04888] select: init of component sm returned success
[n132:04855] select: initializing btl component openib
[n131][[33322,1],1][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x1255, part ID 256
[n131][[33322,1],1][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: NetEffect NE020
[n131][[33322,1],1][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[n131][[33322,1],1][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: default
[n131:04866] openib BTL: oob CPC only supported on InfiniBand; skipped on device nes0
[n131:04866] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device nes0
[n131:04866] openib BTL: rdmacm CPC available for use on nes0
[n131:04866] select: init of component openib returned success
[n131:04866] select: initializing btl component self
[n131:04866] select: init of component self returned success
[n131:04866] select: initializing btl component sm
[n131:04866] select: init of component sm returned success
[n135:04883] select: initializing btl component openib
[n133:04850] select: initializing btl component openib
[n132][[33322,1],5][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x1255, part ID 256
[n132][[33322,1],5][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: NetEffect NE020
[n132][[33322,1],5][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[n132][[33322,1],5][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: default
[n132:04855] openib BTL: oob CPC only supported on InfiniBand; skipped on device nes0
[n132:04855] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device nes0
[n132:04855] openib BTL: rdmacm CPC available for use on nes0
[n132:04855] select: init of component openib returned success
[n132:04855] select: initializing btl component self
[n132:04855] select: init of component self returned success
[n132:04855] select: initializing btl component sm
[n132:04855] select: init of component sm returned success
[n137:04890] mca: base: components_open: opening btl components
[n137:04890] mca: base: components_open: found loaded component openib
[n137:04890] mca: base: components_open: component openib has no register function
[n137:04890] mca: base: components_open: component openib open function successful
[n137:04890] mca: base: components_open: found loaded component self
[n137:04890] mca: base: components_open: component self has no register function
[n137:04890] mca: base: components_open: component self open function successful
[n137:04890] mca: base: components_open: found loaded component sm
[n137:04890] mca: base: components_open: component sm has no register function
[n137:04890] mca: base: components_open: component sm open function successful
[n135][[33322,1],3][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x1255, part ID 256
[n135][[33322,1],3][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: NetEffect NE020
[n135][[33322,1],3][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[n135][[33322,1],3][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: default
[n133][[33322,1],6][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x1255, part ID 256
[n133][[33322,1],6][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: NetEffect NE020
[n133][[33322,1],6][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[n133][[33322,1],6][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: default
[n135:04883] openib BTL: oob CPC only supported on InfiniBand; skipped on device nes0
[n135:04883] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device nes0
[n135:04883] openib BTL: rdmacm CPC available for use on nes0
[n135:04883] select: init of component openib returned success
[n135:04883] select: initializing btl component self
[n135:04883] select: init of component self returned success
[n135:04883] select: initializing btl component sm
[n135:04883] select: init of component sm returned success
[n133:04850] openib BTL: oob CPC only supported on InfiniBand; skipped on device nes0
[n133:04850] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device nes0
[n133:04850] openib BTL: rdmacm CPC available for use on nes0
[n133:04850] select: init of component openib returned success
[n133:04850] select: initializing btl component self
[n133:04850] select: init of component self returned success
[n133:04850] select: initializing btl component sm
[n133:04850] select: init of component sm returned success
[n137:04890] select: initializing btl component openib
[n137][[33322,1],7][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x1255, part ID 256
[n137][[33322,1],7][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: NetEffect NE020
[n137][[33322,1],7][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[n137][[33322,1],7][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: default
[n137:04890] openib BTL: oob CPC only supported on InfiniBand; skipped on device nes0
[n137:04890] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device nes0
[n137:04890] openib BTL: rdmacm CPC available for use on nes0
[n137:04890] select: init of component openib returned success
[n137:04890] select: initializing btl component self
[n137:04890] select: init of component self returned success
[n137:04890] select: initializing btl component sm
[n137:04890] select: init of component sm returned success
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[33322,1],0]) is on host: n130
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[33322,1],2]) is on host: n134
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[33322,1],5]) is on host: n132
  Process 2 ([[33322,1],0]) is on host: n130
  BTLs attempted: openib self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[33322,1],7]) is on host: n137
  Process 2 ([[33322,1],0]) is on host: n130
  BTLs attempted: openib self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[33322,1],3]) is on host: n135
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[33322,1],6]) is on host: n133
  Process 2 ([[33322,1],0]) is on host: n130
  BTLs attempted: openib self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[33322,1],1]) is on host: n131
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[33322,1],4]) is on host: n136
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[n134:4888] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[n137:4890] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[n135:4883] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[n133:4850] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[n136:4866] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[n131:4866] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[n132:4855] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
mpirun has exited due to process rank 3 with PID 4883 on
node n135x exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[n130:4885] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[root@n130 scripts]#
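
Because help aggregation is disabled (orte_base_help_aggregate 0), each rank prints its own copy of the "unable to reach" message, which makes the failing pairs hard to see at a glance. The following is a minimal sketch, not part of Open MPI, that extracts the rank and host pairs from output like the above. The function names unreachable_pairs and peer_counts are hypothetical helpers, and the regular expression assumes the message layout shown in this log.

```python
import re
from collections import Counter

# Matches the two "Process N (...) is on host: X" lines of one error report.
# Assumption: the layout is exactly as printed in the log above.
PAIR_RE = re.compile(
    r"Process 1 \(\[\[\d+,\d+\],(\d+)\]\) is on host: (\S+)\s+"
    r"Process 2 \(\[\[\d+,\d+\],(\d+)\]\) is on host: (\S+)"
)


def unreachable_pairs(log_text):
    """Return (rank1, host1, rank2, host2) for each 'unable to reach' report."""
    return [
        (int(r1), h1, int(r2), h2)
        for r1, h1, r2, h2 in PAIR_RE.findall(log_text)
    ]


def peer_counts(pairs):
    """Count how often each rank is named as the unreachable peer (Process 2)."""
    return Counter(r2 for _, _, r2, _ in pairs)


# Two reports copied from the output above.
sample = """\
  Process 1 ([[33322,1],0]) is on host: n130
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm

  Process 1 ([[33322,1],5]) is on host: n132
  Process 2 ([[33322,1],0]) is on host: n130
  BTLs attempted: openib self sm
"""

if __name__ == "__main__":
    pairs = unreachable_pairs(sample)
    for r1, h1, r2, h2 in pairs:
        print(f"rank {r1} ({h1}) cannot reach rank {r2} ({h2})")
    print(peer_counts(pairs))
```

Running it over the full saved output (for example, the text read from a file) lists every reported pair, so the host name each rank reports for itself can be compared with the name its peers report for it.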

Thanks,
Kartik


[root@n130 scripts]# mpirun -n 8 -np 8 -hostfile /root/mpd.hosts -mca btl openib,self,sm --mca orte_base_help_aggregate 0 --mca btl_base_verbose 10 --mca btl_openib_verbose 100 /opt/openmpi-1.3.2/NetEffect/test_bin/IMB_3.2/IMB-MPI1 Barrier
[n130:04885] mca: base: components_open: Looking for btl components
[n130:04885] mca: base: components_open: opening btl components
[n130:04885] mca: base: components_open: found loaded component openib
[n130:04885] mca: base: components_open: component openib has no register function
[n130:04885] mca: base: components_open: component openib open function successful
[n130:04885] mca: base: components_open: found loaded component self
[n130:04885] mca: base: components_open: component self has no register function
[n130:04885] mca: base: components_open: component self open function successful
[n130:04885] mca: base: components_open: found loaded component sm
[n130:04885] mca: base: components_open: component sm has no register function
[n130:04885] mca: base: components_open: component sm open function successful
[n134:04888] mca: base: components_open: Looking for btl components
[n136:04866] mca: base: components_open: Looking for btl components
[n130:04885] select: initializing btl component openib
[n131:04866] mca: base: components_open: Looking for btl components
[n134:04888] mca: base: components_open: opening btl components
[n134:04888] mca: base: components_open: found loaded component openib
[n134:04888] mca: base: components_open: component openib has no register function
[n136:04866] mca: base: components_open: opening btl components
[n136:04866] mca: base: components_open: found loaded component openib
[n136:04866] mca: base: components_open: component openib has no register function
[n134:04888] mca: base: components_open: component openib open function successful
[n134:04888] mca: base: components_open: found loaded component self
[n134:04888] mca: base: components_open: component self has no register function
[n134:04888] mca: base: components_open: component self open function successful
[n134:04888] mca: base: components_open: found loaded component sm
[n134:04888] mca: base: components_open: component sm has no register function
[n134:04888] mca: base: components_open: component sm open function successful
[n136:04866] mca: base: components_open: component openib open function successful
[n136:04866] mca: base: components_open: found loaded component self
[n136:04866] mca: base: components_open: component self has no register function
[n136:04866] mca: base: components_open: component self open function successful
[n136:04866] mca: base: components_open: found loaded component sm
[n136:04866] mca: base: components_open: component sm has no register function
[n136:04866] mca: base: components_open: component sm open function successful
[n132:04855] mca: base: components_open: Looking for btl components
[n133:04850] mca: base: components_open: Looking for btl components
[n130][[33322,1],0][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x1255, part ID 256
[n130][[33322,1],0][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: NetEffect NE020
[n130][[33322,1],0][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[n130][[33322,1],0][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: default
[n130:04885] openib BTL: oob CPC only supported on InfiniBand; skipped on device nes0
[n130:04885] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device nes0
[n130:04885] openib BTL: rdmacm CPC available for use on nes0
[n130:04885] select: init of component openib returned success
[n130:04885] select: initializing btl component self
[n130:04885] select: init of component self returned success
[n130:04885] select: initializing btl component sm
[n130:04885] select: init of component sm returned success
[n135:04883] mca: base: components_open: Looking for btl components
[n131:04866] mca: base: components_open: opening btl components
[n131:04866] mca: base: components_open: found loaded component openib
[n131:04866] mca: base: components_open: component openib has no register function
[n131:04866] mca: base: components_open: component openib open function successful
[n131:04866] mca: base: components_open: found loaded component self
[n131:04866] mca: base: components_open: component self has no register function
[n131:04866] mca: base: components_open: component self open function successful
[n131:04866] mca: base: components_open: found loaded component sm
[n131:04866] mca: base: components_open: component sm has no register function
[n131:04866] mca: base: components_open: component sm open function successful
[n134:04888] select: initializing btl component openib
[n136:04866] select: initializing btl component openib
[n131:04866] select: initializing btl component openib
[n132:04855] mca: base: components_open: opening btl components
[n132:04855] mca: base: components_open: found loaded component openib
[n132:04855] mca: base: components_open: component openib has no register function
[n132:04855] mca: base: components_open: component openib open function successful
[n132:04855] mca: base: components_open: found loaded component self
[n132:04855] mca: base: components_open: component self has no register function
[n132:04855] mca: base: components_open: component self open function successful
[n132:04855] mca: base: components_open: found loaded component sm
[n132:04855] mca: base: components_open: component sm has no register function
[n132:04855] mca: base: components_open: component sm open function successful
[n133:04850] mca: base: components_open: opening btl components
[n133:04850] mca: base: components_open: found loaded component openib
[n133:04850] mca: base: components_open: component openib has no register function
[n133:04850] mca: base: components_open: component openib open function successful
[n133:04850] mca: base: components_open: found loaded component self
[n133:04850] mca: base: components_open: component self has no register function
[n133:04850] mca: base: components_open: component self open function successful
[n133:04850] mca: base: components_open: found loaded component sm
[n133:04850] mca: base: components_open: component sm has no register function
[n133:04850] mca: base: components_open: component sm open function successful
[n136][[33322,1],4][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x1255, part ID 256
[n136][[33322,1],4][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: NetEffect NE020
[n136][[33322,1],4][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[n136][[33322,1],4][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: default
[n136:04866] openib BTL: oob CPC only supported on InfiniBand; skipped on device nes0
[n136:04866] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device nes0
[n136:04866] openib BTL: rdmacm CPC available for use on nes0
[n136:04866] select: init of component openib returned success
[n136:04866] select: initializing btl component self
[n136:04866] select: init of component self returned success
[n136:04866] select: initializing btl component sm
[n136:04866] select: init of component sm returned success
[n135:04883] mca: base: components_open: opening btl components
[n135:04883] mca: base: components_open: found loaded component openib
[n135:04883] mca: base: components_open: component openib has no register function
[n135:04883] mca: base: components_open: component openib open function successful
[n135:04883] mca: base: components_open: found loaded component self
[n135:04883] mca: base: components_open: component self has no register function
[n135:04883] mca: base: components_open: component self open function successful
[n135:04883] mca: base: components_open: found loaded component sm
[n135:04883] mca: base: components_open: component sm has no register function
[n135:04883] mca: base: components_open: component sm open function successful
[n137:04890] mca: base: components_open: Looking for btl components
[n134][[33322,1],2][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x1255, part ID 256
[n134][[33322,1],2][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: NetEffect NE020
[n134][[33322,1],2][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[n134][[33322,1],2][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: default
[n134:04888] openib BTL: oob CPC only supported on InfiniBand; skipped on device nes0
[n134:04888] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device nes0
[n134:04888] openib BTL: rdmacm CPC available for use on nes0
[n134:04888] select: init of component openib returned success
[n134:04888] select: initializing btl component self
[n134:04888] select: init of component self returned success
[n134:04888] select: initializing btl component sm
[n134:04888] select: init of component sm returned success
[n132:04855] select: initializing btl component openib
[n131][[33322,1],1][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x1255, part ID 256
[n131][[33322,1],1][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: NetEffect NE020
[n131][[33322,1],1][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[n131][[33322,1],1][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: default
[n131:04866] openib BTL: oob CPC only supported on InfiniBand; skipped on device nes0
[n131:04866] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device nes0
[n131:04866] openib BTL: rdmacm CPC available for use on nes0
[n131:04866] select: init of component openib returned success
[n131:04866] select: initializing btl component self
[n131:04866] select: init of component self returned success
[n131:04866] select: initializing btl component sm
[n131:04866] select: init of component sm returned success
[n135:04883] select: initializing btl component openib
[n133:04850] select: initializing btl component openib
[n132][[33322,1],5][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x1255, part ID 256
[n132][[33322,1],5][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: NetEffect NE020
[n132][[33322,1],5][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[n132][[33322,1],5][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: default
[n132:04855] openib BTL: oob CPC only supported on InfiniBand; skipped on device nes0
[n132:04855] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device nes0
[n132:04855] openib BTL: rdmacm CPC available for use on nes0
[n132:04855] select: init of component openib returned success
[n132:04855] select: initializing btl component self
[n132:04855] select: init of component self returned success
[n132:04855] select: initializing btl component sm
[n132:04855] select: init of component sm returned success
[n137:04890] mca: base: components_open: opening btl components
[n137:04890] mca: base: components_open: found loaded component openib
[n137:04890] mca: base: components_open: component openib has no register function
[n137:04890] mca: base: components_open: component openib open function successful
[n137:04890] mca: base: components_open: found loaded component self
[n137:04890] mca: base: components_open: component self has no register function
[n137:04890] mca: base: components_open: component self open function successful
[n137:04890] mca: base: components_open: found loaded component sm
[n137:04890] mca: base: components_open: component sm has no register function
[n137:04890] mca: base: components_open: component sm open function successful
[n135][[33322,1],3][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x1255, part ID 256
[n135][[33322,1],3][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: NetEffect NE020
[n135][[33322,1],3][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[n135][[33322,1],3][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: default
[n133][[33322,1],6][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x1255, part ID 256
[n133][[33322,1],6][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: NetEffect NE020
[n133][[33322,1],6][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[n133][[33322,1],6][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: default
[n135:04883] openib BTL: oob CPC only supported on InfiniBand; skipped on device nes0
[n135:04883] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device nes0
[n135:04883] openib BTL: rdmacm CPC available for use on nes0
[n135:04883] select: init of component openib returned success
[n135:04883] select: initializing btl component self
[n135:04883] select: init of component self returned success
[n135:04883] select: initializing btl component sm
[n135:04883] select: init of component sm returned success
[n133:04850] openib BTL: oob CPC only supported on InfiniBand; skipped on device nes0
[n133:04850] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device nes0
[n133:04850] openib BTL: rdmacm CPC available for use on nes0
[n133:04850] select: init of component openib returned success
[n133:04850] select: initializing btl component self
[n133:04850] select: init of component self returned success
[n133:04850] select: initializing btl component sm
[n133:04850] select: init of component sm returned success
[n137:04890] select: initializing btl component openib
[n137][[33322,1],7][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x1255, part ID 256
[n137][[33322,1],7][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: NetEffect NE020
[n137][[33322,1],7][btl_openib_ini.c:166:ompi_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[n137][[33322,1],7][btl_openib_ini.c:185:ompi_btl_openib_ini_query] Found corresponding INI values: default
[n137:04890] openib BTL: oob CPC only supported on InfiniBand; skipped on device nes0
[n137:04890] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device nes0
[n137:04890] openib BTL: rdmacm CPC available for use on nes0
[n137:04890] select: init of component openib returned success
[n137:04890] select: initializing btl component self
[n137:04890] select: init of component self returned success
[n137:04890] select: initializing btl component sm
[n137:04890] select: init of component sm returned success
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[33322,1],0]) is on host: n130
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[33322,1],2]) is on host: n134
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[33322,1],5]) is on host: n132
  Process 2 ([[33322,1],0]) is on host: n130
  BTLs attempted: openib self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[33322,1],7]) is on host: n137
  Process 2 ([[33322,1],0]) is on host: n130
  BTLs attempted: openib self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[33322,1],3]) is on host: n135
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[33322,1],6]) is on host: n133
  Process 2 ([[33322,1],0]) is on host: n130
  BTLs attempted: openib self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[33322,1],1]) is on host: n131
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[33322,1],4]) is on host: n136
  Process 2 ([[33322,1],5]) is on host: n132x
  BTLs attempted: openib self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[n134:4888] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[n137:4890] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[n135:4883] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[n133:4850] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[n136:4866] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[n131:4866] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[n132:4855] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
mpirun has exited due to process rank 3 with PID 4883 on
node n135x exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[n130:4885] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[root@n130 scripts]#