Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: Linuxes shipping libibverbs
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-05-22 07:30:53

On May 22, 2008, at 5:26 AM, Pavel Shamis (Pasha) wrote:

> OK, we will have our own warning mechanism. But there is still an open
> question: will we show an error message (by default) when libibverbs
> exists but there is no HCA in the hca_list?
> I think we should show the error. The problem of a default libibverbs
> install is relevant only for binary distributions that install all OMPI
> dependencies with the OMPI package. In that case the distribution will
> have an openib MCA parameter that allows the warning message to be
> disabled by default during OMPI package install (or build).
> I guess that most people still install OMPI from source. In that case it
> sounds reasonable to me to print this "no hca" warning if the openib BTL
> was built.

I'm not sure I follow this logic -- can you explain more?

Why does this only apply to binary distributions? If libibverbs is
installed by default, then OMPI will still build the openib BTL (and
therefore warn if it's not used). Granted, some distros will only
install libibverbs if either explicitly or implicitly requested (e.g.,
via dependency). What if some other dependency pulls in libibverbs,
even if OMPI was built from a source tarball?

Let me ask another question: is it common to have the verbs stack /
hardware so hosed up that ibv_get_device_list() returns an empty list
when there really is a device there? My assumption is that this is
quite uncommon; that ibv_get_device_list() will usually return that
there *are* devices and errors show up later during initialization,
etc. Never say "never", of course; I'm sure that there are degenerate
corner cases where a badly hosed device will cause
ibv_get_device_list() to return an empty list -- but I'm assuming that
those cases are very few and far between. In such cases, the
ibv_devinfo(1) and ibv_devices(1) commands would show the same error.

Keep in mind that I'm *only* talking about disabling the default
warning from the openib btl when ibv_get_device_list() returns an
empty list (and there will be an option to enable it if you want,
which we can set via OFED/other packaging for those who want/need
it). All other warnings/errors will remain exactly as they are.
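
As a rough sketch of the proposal above (not the actual openib BTL code):
the warning for an empty device list is off by default and can be switched
back on by a setting. The names openib_warn_config, warn_no_devices,
should_warn_no_devices and openib_check_devices are hypothetical, and
num_devices stands in for the count that ibv_get_device_list() reports
through its output argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical setting; NOT an actual Open MPI MCA parameter name. */
struct openib_warn_config {
    bool warn_no_devices;   /* proposed default: false (stay silent) */
};

/*
 * Decide whether to print the "no devices" warning.
 * Only the empty-list case is affected; other errors are out of scope.
 */
bool should_warn_no_devices(const struct openib_warn_config *cfg,
                            int num_devices)
{
    assert(cfg != NULL);
    assert(num_devices >= 0);
    return num_devices == 0 && cfg->warn_no_devices;
}

/* Returns true if there are devices to go on and initialize. */
bool openib_check_devices(const struct openib_warn_config *cfg,
                          int num_devices)
{
    if (should_warn_no_devices(cfg, num_devices)) {
        fprintf(stderr,
                "warning: libibverbs is present but reported no devices\n");
    }
    return num_devices > 0;
}
```

With warn_no_devices left false, a machine that merely has libibverbs
installed stays quiet; a package such as OFED could ship with the setting
turned on to get the warning back.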

Jeff Squyres
Cisco Systems