On May 16, 2010, at 1:32 PM, Lydia Heck wrote:
> When running over gigabit using -mca btl tcp,self,sm, the code runs all right. That is good, as the largest part of our cluster is on gigabit and, since Gadget-3 scales rather well, the penalty for running over gigabit is not prohibitive.
> We also have a Myrinet cluster, and larger runs freeze there. However, as the gigabit cluster was available, we had not really investigated this until just now.
I can't help with the IB issue, but I am interested in the issue running over MX.
I found a ticket from 2007 regarding Gadget-2. The last set of emails on it indicated that the app was running. You have had a few tickets since, but none mentioned Gadget. Can you give me more details about the hang that you experienced?
I have a couple of ideas that we could investigate (one in Open MPI and the other in MX).