I have a user's code that sometimes appears to hang in MPI_Waitall(); the stack trace from padb is below. It is running on QLogic IB using the PSM MTL.
Without knowing which requests go to which rank, how can I check that this code didn't just get itself into a deadlock? Is there a way to get a readable list of every rank's posted sends, and then to query a waiting MPI_Waitall() in a running job to see which sends/recvs it is waiting on?
CAEN Advanced Computing