Hi,
When the asynchronous event device handler (btl_openib_async_deviceh())
gets an async event and XRC is enabled, the XRC bit is cleared in order
to process the event_type value, but orte_show_help is called with the
original event_type value (i.e. with the XRC bit still set). This leads
to the following kind of message:
----------------------------------------------------------
The OpenFabrics stack has reported a network error event. Open MPI
will try to continue. but your job may end up failing.
Local host: XXXX
MPI process PID: 31818
Error number: -2147483645 (UNKNOWN)
This error may indicate connectivity problems within the fabric;
please contact your system administrator
-----------------------------------------------------------
whereas the expected error number is:
Error number: 3 (IBV_EVENT_QP_ACCESS_ERR)
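To illustrate the arithmetic, here is a minimal standalone sketch (not the attached patch, and not the actual Open MPI code). It assumes the XRC marker is the top bit of a 32-bit event type, which is consistent with -2147483645 being 0x80000003. The names XRC_EVENT_FLAG, strip_xrc_flag, reported_error_number and show_reported_values are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: the XRC marker is bit 31 of the event type.
 * Hypothetical name, chosen for this example only. */
#define XRC_EVENT_FLAG 0x80000000u

/* Return the event type with the XRC marker cleared. */
uint32_t strip_xrc_flag(uint32_t raw_event_type)
{
    return raw_event_type & ~XRC_EVENT_FLAG;
}

/* Return the value as it would appear when printed as a signed
 * 32-bit integer (two's complement platforms assumed). */
int32_t reported_error_number(uint32_t event_type)
{
    return (int32_t)event_type;
}

/* Print the number reported with and without the XRC marker. */
void show_reported_values(void)
{
    uint32_t raw = XRC_EVENT_FLAG | 3u;   /* event 3 flagged as XRC */
    uint32_t cleaned = strip_xrc_flag(raw);

    assert(cleaned == 3u);

    /* Reporting the raw value gives the confusing negative number. */
    printf("Error number: %" PRId32 "\n", reported_error_number(raw));
    /* Reporting the cleaned value gives the expected event code. */
    printf("Error number: %" PRId32 "\n", reported_error_number(cleaned));
}
```

Calling show_reported_values() prints -2147483645 for the raw value and 3 for the cleaned one, i.e. the same value that is used to process the event should also be the one passed to the help message.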
I propose the attached small patch to fix this issue.
Regards,
Nadia
--
nadia.derbey <Nadia.Derbey_at_[hidden]>