Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: Linuxes shipping libibverbs
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-05-22 07:21:02


On May 22, 2008, at 6:50 AM, Terry Dontje wrote:

>> Brian and I chatted a bit about this off-list, and I think we're in
>> agreement now:
>>
>> - do not change the default value or meaning of
>> btl_base_want_component_unused.
>>
>> - major point of confusion: the openib BTL is actually fairly unique
>> in that it can (and does) tell the difference between "there are no
>> devices present" and "there are devices, but something went wrong".
>> Other BTLs have network interfaces that can't tell the difference and
>> can *only* call the no_nics function, regardless of whether there are
>> no relevant network interfaces or some error occurred during
>> initialization.
>>
>> - so a reasonable solution would be an openib-BTL-specific mechanism
>> that doesn't call the no_nics function (which displays the
>> btl_base_want_component_unused warning) when no verbs-capable devices
>> are found, because mainline Linuxes are starting to ship
>> libibverbs. Specific mechanism TBD; likely to be an openib MCA
>> param.
>>
> So, if you are delivering something similar to a BTL for Myrinet, you
> will see the message, but the belief is that this is necessary since
> there isn't enough granularity in the error reporting of the device
> to feel comfortable about whether the user wants the device to be
> used?

The major difference here is that libmyriexpress is not being included
in mainline Linux distributions. Specifically: if you can find/use
libmyriexpress, it's likely because you have that hardware. The same
*used* to be true for libibverbs, but is no longer true because Linux
distros are now shipping it (e.g., the Debian distribution pulls in
libibverbs when you install Open MPI).
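
As a rough sketch of the decision being discussed (this is not actual
Open MPI code; the enum, the function, and the flags
can_distinguish_no_devices and warn_no_devices are all made up for
illustration, and the real mechanism is still TBD):

```c
#include <stdbool.h>

/* Hypothetical outcome of a BTL probing for its devices. */
typedef enum {
    PROBE_OK,         /* usable devices were found */
    PROBE_NO_DEVICES, /* library is installed, but no hardware present */
    PROBE_ERROR       /* hardware present, but initialization failed */
} probe_result_t;

/*
 * Hypothetical helper: should the "component unused" warning be shown?
 *
 * want_component_unused      - stands in for the existing
 *                              btl_base_want_component_unused setting
 * can_distinguish_no_devices - true only for a BTL (such as openib) that
 *                              can tell "no devices" from "error"
 * warn_no_devices            - stands in for the proposed openib-specific
 *                              MCA param (name and default are assumptions)
 */
static bool should_warn_unused(probe_result_t result,
                               bool want_component_unused,
                               bool can_distinguish_no_devices,
                               bool warn_no_devices)
{
    if (PROBE_OK == result || !want_component_unused) {
        return false;
    }
    /* Only a BTL that can tell the two cases apart may stay quiet
       when the library is present but there is simply no hardware. */
    if (can_distinguish_no_devices && PROBE_NO_DEVICES == result) {
        return warn_no_devices;
    }
    /* Everyone else cannot tell the difference, so they always warn. */
    return true;
}
```

With this shape, a BTL that cannot tell the cases apart behaves exactly
as it does today, and only the "library shipped by the distro, no
hardware" case for openib goes quiet.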

> Won't udapl have a similar issue here or does it not get built by
> default when OFED is built?

We decided that under Linux, the udapl BTL does not get built by
default (even if it could) because then an "mpirun a.out" by default
would use both UDAPL and verbs, which is undesirable for several
reasons. There's Linux-specific logic to this effect in
config/ompi_check_udapl.m4.

> FWIW, our distribution actually turns off
> btl_base_want_component_unused because it seemed that, in the
> majority of our cases, users would see false-positive sightings of
> the message.

Is the UDAPL library shipped in Solaris by default? If so, then
you're likely in exactly the same kind of situation that I'm
describing. The same will be true if Solaris ends up shipping
libibverbs by default.

-- 
Jeff Squyres
Cisco Systems