Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Debugging Runtime/Ethernet Problems
From: Lloyd Brown (lloyd_brown_at_[hidden])
Date: 2013-09-20 13:00:33


1 - How do I check the BTLs available? Something like
"ompi_info | grep -i btl"? If so, here's the list:

> MCA btl: ofud (MCA v2.0, API v2.0, Component v1.6.3)
> MCA btl: openib (MCA v2.0, API v2.0, Component v1.6.3)
> MCA btl: self (MCA v2.0, API v2.0, Component v1.6.3)
> MCA btl: sm (MCA v2.0, API v2.0, Component v1.6.3)
> MCA btl: tcp (MCA v2.0, API v2.0, Component v1.6.3)

2 - The IP interfaces on all nodes are:
- em1 - Ethernet - IP in the 192.168.216.0/22 range
- ib0 - IPoIB (only on IB-enabled nodes) - IP in the 192.168.212.0/22 range
- lo - loopback - 127.0.0.1/8

And I think that Jeff is absolutely right. This syntax did work:

> mpirun --mca btl ^openib --mca btl_tcp_if_exclude 192.168.212.0/22,127.0.0.1/8 ./osu_bw

And this one too, which is basically equivalent in this case:

> mpirun --mca btl ^openib --mca btl_tcp_if_exclude ib0,lo ./osu_bw

It is interesting to me, though, that I need to explicitly exclude
lo/127.0.0.1 in this case, whereas on an Ethernet-only node I can just
run the plain "mpirun ./appname" without excluding anything, and it
figures out on its own to use em1 and not lo.
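
For illustration only, here is a small Python sketch of why the two
exclude lists above pick the same interface on this kind of node, and
why leaving loopback out of the list matters. It is not how Open MPI
implements interface selection; the function name "remaining" and the
host addresses are hypothetical (the thread only gives the subnets). One
plausible explanation for the behaviour described above, which is an
assumption here and not confirmed in this thread, is that the default
value of btl_tcp_if_exclude already lists the loopback interface, so
setting the parameter by hand replaces that default rather than adding
to it. "ompi_info --param btl tcp" should show the default on a given
install.

```python
import ipaddress

# Hypothetical addresses for one IB-enabled node; only the subnets
# (192.168.216.0/22, 192.168.212.0/22, 127.0.0.1/8) come from the thread.
interfaces = {
    "em1": "192.168.216.10",
    "ib0": "192.168.212.10",
    "lo": "127.0.0.1",
}


def remaining(interfaces, exclude):
    """Return the interface names left after applying an exclude list.

    `exclude` is a comma-separated string whose items are either
    interface names (e.g. "ib0") or CIDR subnets (e.g. "127.0.0.1/8").
    """
    names = set()
    nets = []
    for item in exclude.split(","):
        item = item.strip()
        if "/" in item:
            # strict=False accepts "127.0.0.1/8" even though host bits are set
            nets.append(ipaddress.ip_network(item, strict=False))
        else:
            names.add(item)

    kept = []
    for name, addr in interfaces.items():
        ip = ipaddress.ip_address(addr)
        if name in names or any(ip in net for net in nets):
            continue  # excluded by name or by subnet
        kept.append(name)
    return sorted(kept)


by_cidr = remaining(interfaces, "192.168.212.0/22,127.0.0.1/8")
by_name = remaining(interfaces, "ib0,lo")
only_ib = remaining(interfaces, "192.168.212.0/22")

print(by_cidr)  # CIDR form of the exclude list
print(by_name)  # name form of the exclude list
print(only_ib)  # what is left if loopback is not listed
```

With these assumed addresses, both full exclude lists leave only em1,
while excluding just the IPoIB subnet leaves lo as a candidate
alongside em1.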

Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu

On 09/20/2013 10:31 AM, Jeff Squyres (jsquyres) wrote:
> On Sep 20, 2013, at 12:27 PM, Lloyd Brown <lloyd_brown_at_[hidden]> wrote:
>
>> Interesting. I was taking the approach of "only exclude what you're
>> certain you don't want" (the native IB and TCP/IPoIB stuff) since I
>> wasn't confident enough in my knowledge of the OpenMPI internals, to
>> know what I should explicitly include.
>>
>> However, taking Jeff's suggestion, this does seem to work, and gives me
>> the expected Ethernet performance:
>>
>> "mpirun --mca btl tcp,sm,self --mca btl_tcp_if_include em1 ./osu_bw"
>>
>> So, in short, I'm still not sure why my exclude syntax doesn't work.
>
> Check two things:
>
> 1. What BTLs are available? Is there some other BTL that may be used instead of openib?
>
> 2. (this one is more likely) What IP interfaces are available on all nodes? The most obvious guess here is that you didn't exclude 127.0.0.1/8, and OMPI found this interface on all nodes, and therefore assumed that it was routable/usable on all nodes. Hence, one quick experiment might be to try your exclude syntax again, but *also* exclude 127.0.0.1/8.
>