On Jun 28, 2012, at 8:04 PM, Yong Qin wrote:
> Thanks to Jeff, we now have a bug registered with the segv issue.
There may be some confusion here with the fact that OMPI supports 2 different MX transports: an MTL and a BTL. Here's what the README says:
- Myrinet MX (and Open-MX) support is shared between the 2 internal
devices, the MTL and the BTL. The design of the BTL interface in
Open MPI assumes that only naive one-sided communication
capabilities are provided by the low level communication layers.
However, modern communication layers such as Myrinet MX, InfiniPath
PSM, or Portals, natively implement highly-optimized two-sided
communication semantics. To leverage these capabilities, Open MPI
provides the "cm" PML and corresponding MTL components to transfer
messages rather than bytes. The MTL interface implements a shorter
code path and lets the low-level network library decide which
protocol to use (depending on issues such as message length,
internal resources and other parameters specific to the underlying
interconnect). However, Open MPI cannot currently use multiple MTL
modules at once. In the case of the MX MTL, process loopback and
on-node shared memory communications are provided by the MX library.
Moreover, the current MX MTL does not support message pipelining
resulting in lower performances in case of non-contiguous
data-types.
The "ob1" and "csum" PMLs and BTL components use Open MPI's internal
on-node shared memory and process loopback devices for high
performance. The BTL interface allows multiple devices to be used
simultaneously. For the MX BTL it is recommended that the first
segment (which acts as a threshold between the eager and the
rendezvous protocol) should always be at most 4KB, but there is no
further restriction on the size of subsequent fragments.
The MX MTL is recommended in the common case for best performance on
10G hardware when most of the data transfers cover contiguous memory
layouts. The MX BTL is recommended in all other cases, such as when
using multiple interconnects at the same time (including TCP), or
transferring non-contiguous data-types.
If you want to use the MX MTL, it may be simplest to remove the MX BTL plugin from your installation directory. That way, the MX MTL *should* be auto-selected on machines with MX, and the openib BTL should be auto-selected on machines that do not have MX but do have OpenFabrics devices.
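As a rough sketch of what removing the plugin could look like: the helper below moves the MX BTL files aside rather than deleting them, so the change is easy to undo. The function name `disable_mx_btl`, the `/opt/openmpi` prefix, and the backup path are made up for illustration, and it assumes a default DSO build where plugins live in `<prefix>/lib/openmpi` and the MX BTL files are named `mca_btl_mx.*`; check your own installation before relying on any of that.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): move the MX BTL plugin files
# out of the plugin directory so that only the MX MTL remains installed.
# ASSUMPTION: the MX BTL plugin files are named mca_btl_mx.* (e.g. .so, .la).
#   $1 = Open MPI plugin directory (assumed to be <prefix>/lib/openmpi)
#   $2 = backup directory outside the plugin directory
disable_mx_btl() {
    plugin_dir="$1"
    backup_dir="$2"
    mkdir -p "$backup_dir" || return 1
    for f in "$plugin_dir"/mca_btl_mx.*; do
        # Skip the unexpanded glob pattern when nothing matches.
        [ -e "$f" ] || continue
        mv "$f" "$backup_dir"/ || return 1
    done
}

# Example usage (both paths are assumptions, adjust to your install):
#   disable_mx_btl /opt/openmpi/lib/openmpi /opt/openmpi/disabled-plugins
#   ompi_info | grep -i mx    # the mx btl should no longer be listed
```

Moving the files back into the plugin directory restores the MX BTL. The MX MTL plugin (`mca_mtl_mx.*` under the same naming assumption) is left untouched.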