
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] OMPI-1.3.2, openib/iWARP(cxgb3) problem: PML add procs failed (Unreachable)
From: Ken Cain (kcain_at_[hidden])
Date: 2009-05-07 09:13:33


Jeff Squyres wrote:
> On May 6, 2009, at 4:45 PM, Ken Cain wrote:
>
>> Is it possible for OMPI to generate output at runtime indicating exactly
>> what btl(s) will be used?
>>
>
> At present, we only have a fairly lame system to do this. We wanted to
> print out a connection map in v1.3, but it didn't happen -- this feature
> has been re-targeted for v1.5:
>
> https://svn.open-mpi.org/trac/ompi/ticket/1207
>
> It's unfortunately a surprisingly complex issue; one reason that it's
> "hard" is that OMPI lazily makes connections and supports striping
> across multiple networks. Hence, to make a completely accurate map,
> OMPI has to guarantee to make *all* network connections and then gather
> all the connection information back to MPI_COMM_WORLD rank 0 to print out.
>
> What OMPI does today is that if you specifically ask for a high-speed
> network and we're unable to find one, we'll warn about it (because if
> you asked for it, you likely really want to use it -- if there isn't
> one, that's likely a problem). So if you:
>
> mpirun --mca btl openib,sm,self,tcp ...
>
> And OMPI doesn't find any active OpenFabrics ports, it'll print a warning.
>
>> Removing tcp below brings me back to the original simple command line
>> that fails with the output shown above (indicating that openib btl will
>> be disabled):
>>
>> mpirun --mca orte_base_help_aggregate 0 --mca btl openib,self --hostfile
>> ~/1usrv_ompi_machfile -np 2 ./NPmpi -p0 -l 1 -u 1024
>>
>
> It looks like you're having two problems:
>
> 1. The RDMACM connector in OMPI decides that it can't be used:
>
> mpirun --mca orte_base_help_aggregate 0 --mca btl openib,self --hostfile
> ~/1usrv_ompi_machfile -np 2 ./NPmpi -p0 -l 1 -u 1024 > outfile1 2>&1
>
> >
> --------------------------------------------------------------------------
> > No OpenFabrics connection schemes reported that they were able to be
> > used on a specific port. As such, the openib BTL (OpenFabrics
> > support) will be disabled for this port.
> >
> > Local host: aae1
> > Local device: cxgb3_0
> > CPCs attempted: oob, xoob, rdmacm
> >
> --------------------------------------------------------------------------
>
> *** Can you re-run this scenario with --mca btl_base_verbose 50? I'd
> like to see why the RDMA CM CPC disqualified itself.

Jeff, thank you very much for taking a look at this. I have re-run with
increased verbosity in three different scenarios:

1) simple command line with verbosity

mpirun --mca orte_base_help_aggregate 0 --mca btl_base_verbose 50 --mca
btl_openib_verbose 50 --mca btl openib,self --hostfile
~/1usrv_ompi_machfile -np 2 ./NPmpi -p0 -l 1 -u 1024 > ~/outfile3 2>&1

The interesting output below indicates that rdmacm found no IP address on the port
(output from one rank is shown; the other MPI rank prints the same):
> [aae4:30924] openib BTL: oob CPC only supported on InfiniBand; skipped on device cxgb3_0
> [aae4:30924] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device cxgb3_0
> [aae4:30924] openib BTL: rdmacm CPC available for use on cxgb3_0
> [aae4:30924] openib BTL: oob CPC only supported on InfiniBand; skipped on device cxgb3_0
> [aae4:30924] openib BTL: xoob CPC only supported with XRC receive queues; skipped on device cxgb3_0
> [aae4:30924] openib BTL: rdmacm IP address not found on port
> [aae4:30924] openib BTL: rdmacm CPC unavailable for use on cxgb3_0; skipped
> [aae4:30924] select: init of component openib returned failure
> [aae4:30924] select: module openib unloaded
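
As an aside, to summarize the CPC decisions buried in a long verbose log, I just grep for the decision lines. A minimal sketch (self-contained with a sample here-doc; in practice, point the grep at the real capture, e.g. ~/outfile3):

```shell
# Summarize openib CPC selection decisions from a btl_base_verbose capture.
# The here-doc below is a sample; replace it with the actual log file.
grep -E 'CPC (available|unavailable)' <<'EOF'
[aae4:30924] openib BTL: rdmacm CPC available for use on cxgb3_0
[aae4:30924] openib BTL: rdmacm IP address not found on port
[aae4:30924] openib BTL: rdmacm CPC unavailable for use on cxgb3_0; skipped
EOF
```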

2) a more complex command line requesting cxgb3_0:1 (the port I believe is
physically connected and configured with an IP address):

mpirun --mca orte_base_help_aggregate 0 --mca btl openib,self --mca
btl_base_verbose 50 --mca btl_openib_verbose 50 --mca
btl_openib_if_include cxgb3_0:1 --mca btl_openib_cpc_include rdmacm
--mca btl_openib_device_type iwarp --mca btl_openib_max_btls 1 --mca
mpi_leave_pinned 1 --hostfile ~/1usrv_ompi_machfile -np 2 ./NPmpi -p0 -l
1 -u 1024 > ~/outfile4_cxgb3_0_port1 2>&1

output (one rank shown, both print the same pattern):
> [aae4:30929] select: initializing btl component openib
> [aae4:30929] openib BTL: rdmacm CPC available for use on cxgb3_0
> [aae4:30929] select: init of component openib returned success
but then!
> PML add procs failed
> --> Returned "Unreachable" (-12) instead of "Success" (0)
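
Since "Unreachable" suggests the connector could not match up the peers' addresses, one thing I will double-check is that the IPs configured on the two hosts' iWARP ports sit on a common subnet (rdmacm sets up its connections over IP). A throwaway check, with made-up addresses and a crude fixed /24 comparison:

```shell
# Compare the first three octets of two dotted-quad IPv4 addresses.
# rdmacm needs the peers' port IPs to be mutually routable; a quick
# same-/24 test catches the common misconfiguration.
same_subnet24() {
    [ "${1%.*}" = "${2%.*}" ]
}
same_subnet24 192.168.1.10 192.168.1.20 && echo "same /24" || echo "different subnets"
# prints "same /24"
```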

3) a more complex command line requesting cxgb3_0:2 (the port I believe is
neither physically connected nor configured with an IP address):

mpirun --mca orte_base_help_aggregate 0 --mca btl openib,self --mca
btl_base_verbose 50 --mca btl_openib_verbose 50 --mca
btl_openib_if_include cxgb3_0:2 --mca btl_openib_cpc_include rdmacm
--mca btl_openib_device_type iwarp --mca btl_openib_max_btls 1 --mca
mpi_leave_pinned 1 --hostfile ~/1usrv_ompi_machfile -np 2 ./NPmpi -p0 -l
1 -u 1024 > ~/outfile4_cxgb3_0_port2 2>&1

output (exhibited by both MPI ranks):
> [aae4:30949] select: initializing btl component openib
> [aae4:30949] openib BTL: rdmacm IP address not found on port
> [aae4:30949] openib BTL: rdmacm CPC unavailable for use on cxgb3_0; skipped
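
That "IP address not found on port" message makes me want to verify which Ethernet interface backs each cxgb3_0 port and whether it actually has an IPv4 address. A minimal sketch (the interface name eth2 is hypothetical; substitute whatever maps to the port on your system):

```shell
# rdmacm skips any port whose backing interface has no IPv4 address,
# which is exactly the symptom in the log above.
has_ip() {
    # $1 = interface name (hypothetical: eth2 for cxgb3_0 port 1)
    ip -o -4 addr show dev "$1" 2>/dev/null | grep -q 'inet '
}
if has_ip eth2; then
    echo "eth2 has an IPv4 address; rdmacm can use that port"
else
    echo "no IPv4 address on eth2; rdmacm will skip that port"
fi
```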

>
> 2. But if you specify the port and to only use the rdmacm connector
> (CPC), the RDMA CM CPC *does* become available (which is just weird -- I
> don't know why that would be different than the above case...), but then
> it decides that it cannot connect:
>
> mpirun --mca orte_base_help_aggregate 0 --mca btl openib,self,sm --mca
> btl_base_verbose 10 --mca btl_openib_verbose 10 --mca
> btl_openib_if_include cxgb3_0:1 --mca btl_openib_cpc_include rdmacm
> --mca btl_openib_device_type iwarp --mca btl_openib_max_btls 1 --mca
> mpi_leave_pinned 1 --hostfile ~/1usrv_ompi_machfile -np 2 ./NPmpi -p0 -l
> 1 -u 1024 > ~/outfile2 2>&1
>
> >...lots of output...
> > [aae4:19426] openib BTL: rdmacm CPC available for use on cxgb3_0
> >...lots of output...
> >
> --------------------------------------------------------------------------
> > At least one pair of MPI processes are unable to reach each other for
> > MPI communications. This means that no Open MPI device has indicated
> > that it can be used to communicate between these processes. This is
> > an error; Open MPI requires that all MPI processes be able to reach
> > each other. This error can sometimes be the result of forgetting to
> > specify the "self" BTL.
> >
> > Process 1 ([[3107,1],0]) is on host: aae4
> > Process 2 ([[3107,1],1]) is on host: aae1
> > BTLs attempted: openib self sm
> >
> > Your MPI job is now going to abort; sorry.
> >
> --------------------------------------------------------------------------
>
> *** Very strange. Can you send the output ibv_devinfo -v from both nodes?
>

Sure, here it is:

[aae4:~] ibv_devinfo -v
hca_id: cxgb3_0
         fw_ver: 7.1.0
         node_guid: 0007:4305:58dd:0000
         sys_image_guid: 0007:4305:58dd:0000
         vendor_id: 0x1425
         vendor_part_id: 49
         hw_ver: 0x1
         board_id: 1425.31
         phys_port_cnt: 2
         max_mr_size: 0x100000000
         page_size_cap: 0xffff000
         max_qp: 32736
         max_qp_wr: 1023
         device_cap_flags: 0x00228000
         max_sge: 4
         max_sge_rd: 1
         max_cq: 32767
         max_cqe: 8192
         max_mr: 32768
         max_pd: 32767
         max_qp_rd_atom: 8
         max_ee_rd_atom: 0
         max_res_rd_atom: 0
         max_qp_init_rd_atom: 8
         max_ee_init_rd_atom: 0
         atomic_cap: ATOMIC_NONE (0)
         max_ee: 0
         max_rdd: 0
         max_mw: 0
         max_raw_ipv6_qp: 0
         max_raw_ethy_qp: 0
         max_mcast_grp: 0
         max_mcast_qp_attach: 0
         max_total_mcast_qp_attach: 0
         max_ah: 0
         max_fmr: 0
         max_srq: 0
         max_pkeys: 0
         local_ca_ack_delay: 0
                 port: 1
                         state: PORT_ACTIVE (4)
                         max_mtu: 4096 (5)
                         active_mtu: 2048 (4)
                         sm_lid: 0
                         port_lid: 0
                         port_lmc: 0x00
                         max_msg_sz: 0xffffffff
                         port_cap_flags: 0x009f0000
                         max_vl_num: invalid value (0)
                         bad_pkey_cntr: 0x0
                         qkey_viol_cntr: 0x0
                         sm_sl: 0
                         pkey_tbl_len: 1
                         gid_tbl_len: 1
                         subnet_timeout: 0
                         init_type_reply: 0
                         active_width: 4X (2)
                         active_speed: 5.0 Gbps (2)
                         phys_state: invalid physical state (0)

                 port: 2
                         state: PORT_ACTIVE (4)
                         max_mtu: 4096 (5)
                         active_mtu: 2048 (4)
                         sm_lid: 0
                         port_lid: 0
                         port_lmc: 0x00
                         max_msg_sz: 0xffffffff
                         port_cap_flags: 0x009f0000
                         max_vl_num: invalid value (0)
                         bad_pkey_cntr: 0x0
                         qkey_viol_cntr: 0x0
                         sm_sl: 0
                         pkey_tbl_len: 1
                         gid_tbl_len: 1
                         subnet_timeout: 0
                         init_type_reply: 0
                         active_width: 4X (2)
                         active_speed: 5.0 Gbps (2)
                         phys_state: invalid physical state (0)

[aae1:~] ibv_devinfo -v
hca_id: cxgb3_0
         fw_ver: 7.1.0
         node_guid: 0007:4305:45ae:0000
         sys_image_guid: 0007:4305:45ae:0000
         vendor_id: 0x1425
         vendor_part_id: 49
         hw_ver: 0x1
         board_id: 1425.31
         phys_port_cnt: 2
         max_mr_size: 0x100000000
         page_size_cap: 0xffff000
         max_qp: 32736
         max_qp_wr: 1023
         device_cap_flags: 0x00228000
         max_sge: 4
         max_sge_rd: 1
         max_cq: 32767
         max_cqe: 8192
         max_mr: 32768
         max_pd: 32767
         max_qp_rd_atom: 8
         max_ee_rd_atom: 0
         max_res_rd_atom: 0
         max_qp_init_rd_atom: 8
         max_ee_init_rd_atom: 0
         atomic_cap: ATOMIC_NONE (0)
         max_ee: 0
         max_rdd: 0
         max_mw: 0
         max_raw_ipv6_qp: 0
         max_raw_ethy_qp: 0
         max_mcast_grp: 0
         max_mcast_qp_attach: 0
         max_total_mcast_qp_attach: 0
         max_ah: 0
         max_fmr: 0
         max_srq: 0
         max_pkeys: 0
         local_ca_ack_delay: 0
                 port: 1
                         state: PORT_ACTIVE (4)
                         max_mtu: 4096 (5)
                         active_mtu: 2048 (4)
                         sm_lid: 0
                         port_lid: 0
                         port_lmc: 0x00
                         max_msg_sz: 0xffffffff
                         port_cap_flags: 0x009f0000
                         max_vl_num: invalid value (0)
                         bad_pkey_cntr: 0x0
                         qkey_viol_cntr: 0x0
                         sm_sl: 0
                         pkey_tbl_len: 1
                         gid_tbl_len: 1
                         subnet_timeout: 0
                         init_type_reply: 0
                         active_width: 4X (2)
                         active_speed: 5.0 Gbps (2)
                         phys_state: invalid physical state (0)

                 port: 2
                         state: PORT_ACTIVE (4)
                         max_mtu: 4096 (5)
                         active_mtu: 2048 (4)
                         sm_lid: 0
                         port_lid: 0
                         port_lmc: 0x00
                         max_msg_sz: 0xffffffff
                         port_cap_flags: 0x009f0000
                         max_vl_num: invalid value (0)
                         bad_pkey_cntr: 0x0
                         qkey_viol_cntr: 0x0
                         sm_sl: 0
                         pkey_tbl_len: 1
                         gid_tbl_len: 1
                         subnet_timeout: 0
                         init_type_reply: 0
                         active_width: 4X (2)
                         active_speed: 5.0 Gbps (2)
                         phys_state: invalid physical state (0)

-Ken
