>> 1. Driver doesn't support the HCA - If I remember correctly, RH40 by
>> default doesn't support the ConnectX HCA. The device_list will be
>> empty. It is a very exotic case.
>> 2. Driver version doesn't match the FW version
>> 3. FW is broken
>> 4. Driver is broken and fails to start - this is not a very exotic
>> case either. Sometimes a user makes some modification
>> (upgrade/install/etc.) and it breaks the driver.
>>> In such cases, the ibv_devinfo(1) and ibv_devices(1) commands would
>>> show the same error.
>> Yep, these utilities will show the same error.
>> We can cover cases 1-3 pretty simply. The OPENIB driver creates
>> "/dev/infiniband" during its startup. So if /dev/infiniband exists
>> and _get_device_list() returns an empty list, we can print a warning.
> Ok, that seems reasonable.
>> I don't know how we can cover case 4 :-(
> If the user makes modifications to the driver and breaks it, I don't
> think we can be held responsible for that -- prudence dictates that
> you should verify that your [self-modified] driver is not broken first
> before blaming Open MPI. I'm not that concerned about #4; most of my
> customers do not modify the drivers.
Agreed about #4.
The check for /dev/infiniband should be simple and I think we can add it
to 1.3.
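
A rough sketch of what the check could look like, just to make the idea
concrete. The helper names (dir_exists, should_warn_no_devices) are made
up for illustration and are not existing Open MPI functions. It assumes
the caller already has the device count, e.g. from
ibv_get_device_list(&num_devices), and passes it in, so the sketch itself
does not need libibverbs:

```c
#include <stdio.h>
#include <sys/stat.h>

/* Return 1 if path exists and is a directory, 0 otherwise. */
int dir_exists(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/*
 * Hypothetical helper (not an Open MPI API).
 * dev_dir     - directory the driver creates at startup; assumed to be
 *               "/dev/infiniband" in real use.
 * num_devices - device count the caller obtained, e.g. from
 *               ibv_get_device_list(&num_devices).
 * Return 1 if the driver appears to be loaded but no devices were
 * found (cases 1-3 above), 0 otherwise.
 */
int should_warn_no_devices(const char *dev_dir, int num_devices)
{
    return num_devices == 0 && dir_exists(dev_dir);
}

/* Print the warning to stderr when the condition holds. */
void warn_if_no_devices(const char *dev_dir, int num_devices)
{
    if (should_warn_no_devices(dev_dir, num_devices)) {
        fprintf(stderr,
                "WARNING: %s exists but no devices were found; "
                "check driver/HCA support and firmware.\n",
                dev_dir);
    }
}
```

In the openib code this would presumably be called once, right after the
device list is fetched, as warn_if_no_devices("/dev/infiniband",
num_devices). It does nothing for case 4: if the driver never starts,
/dev/infiniband is not created and no warning is printed.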