Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: Linuxes shipping libibverbs
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-05-22 13:23:41


On May 22, 2008, at 11:53 AM, Pavel Shamis (Pasha) wrote:

> 1. Driver doesn't support the HCA - If I remember correctly, RH40 by
> default doesn't support the ConnectX HCA. The device_list will be
> empty. It is a very exotic case.
> 2. Driver version doesn't correspond with the FW version
> 3. FW was broken
> 4. Driver was broken and failed to start - this is not a very exotic
> case either. Sometimes a user makes some modification (upgrade,
> install, etc.) and it breaks the driver.
>
>> In such cases, the ibv_devinfo(1) and ibv_devices(1) commands
>> would show the same error.
> Yep, these utilities will show the same error.
>
> Cases 1-2-3 are pretty simple to cover. The OPENIB driver creates
> "/dev/infiniband" during its startup. So if /dev/infiniband exists and
> _get_device_list() is empty, we may print a warning.

Ok, that seems reasonable.
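
A minimal sketch of that check, under stated assumptions: the helper names dir_exists() and should_warn_no_devices() are hypothetical and are not existing Open MPI functions, and the device count is passed in as a plain integer. In real code the count would presumably come from libibverbs, e.g. the num_devices value filled in by ibv_get_device_list().

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper: true if `path` exists and is a directory. */
static bool dir_exists(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/*
 * Hypothetical helper: the driver created its device directory
 * (so it was at least started), yet no devices were reported.
 * That combination is what cases 1-3 look like from user space.
 */
static bool should_warn_no_devices(const char *dev_dir, int num_devices)
{
    return num_devices == 0 && dir_exists(dev_dir);
}
```

A caller would pass "/dev/infiniband" as dev_dir and print the warning when the function returns true. Case 4 (driver failed to start) is not detected, since the directory would likely not exist in that situation.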

> I don't know how we can cover case 4 :-(

If the user makes modifications to the driver and breaks it, I don't
think we can be held responsible for that -- prudence dictates that
you should verify that your [self-modified] driver is not broken
before blaming Open MPI. I'm not that concerned about #4; most of my
customers do not modify the drivers.

> BTW, I think this problem is relevant for all BTLs and not only
> openib, and maybe we need to look for some global solution.

Brian's solution was reasonable; perhaps just adding a flag to the
existing no_nics function.

-- 
Jeff Squyres
Cisco Systems