Another of our users has hit the bug that causes collectives (this time MPI_Bcast()) to hang on IB. The hang was fixed by setting:
btl_openib_cpc_include rdmacm
My question: if we make this the default on our system with an environment variable, does it introduce any performance or other issues we should be aware of?
Is there a reason we should not use rdmacm?
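For reference, here is a minimal sketch of what we would put in the system-wide shell profile. It assumes the usual Open MPI convention that an MCA parameter can be set through an environment variable named with the OMPI_MCA_ prefix; the commented alternatives are the other standard ways to set the same parameter, and the application name in them is only a placeholder.

```shell
# Make rdmacm the connection method for the openib BTL for every job
# started from this environment (MCA parameter passed via the environment).
export OMPI_MCA_btl_openib_cpc_include=rdmacm

# Equivalent per-job form on the command line (./my_app is a placeholder):
#   mpirun --mca btl_openib_cpc_include rdmacm ./my_app
#
# Equivalent site-wide form, as one line in openmpi-mca-params.conf:
#   btl_openib_cpc_include = rdmacm
```

Setting it through the environment keeps the change easy to override for a single job, since a value given on the mpirun command line takes precedence.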
Thanks!
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp_at_[hidden]
(734)936-1985