Open MPI logo

Open MPI User's Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Open MPI User's mailing list

Subject: [OMPI users] MPI_Bcast hanging after some amount of data transferred on Infiniband network
From: Dusan Zoric (dusan.zoric_at_[hidden])
Date: 2013-07-26 03:15:40


I am running application that performs some transformations of large
matrices on 7-node cluster. Nodes are connected via QDR 40 Gbit Infiniband.
Open MPI 1.4.3 is installed on the system.

Given matrix transformation requires large data exchange between nodes in
such a way that at each algorithm step there is one node that sends data
and all others receive. Number of processes is equal to number of nodes
used. I have to say that I am relatively new at MPI, but it seemed that
ideal way of performing this is by using MPI_Bcast.

Everything worked fine for some not so large matrices. However, when matrix
size increases, at some point application hangs and stays there forever.

I am not completely sure, but it seems like there is no errors in my code.
I traced it in detail in order to check if there are some uncompleted
collective operations before that specific call of MPI_Bcast, but
everything looks fine. Also, for that specific call, root is correctly set
in all processes, as well as message type and size, and, of course,
MPI_Bcast is called in all processes.

I also ran a lot of scenarios (running application on matrices of different
sizes and changing the number of processes) in order to figure out when
this happens. What can be observed is the following:

   - for the matrix of the same size, application successfully finishes if
   I decrees number of processes
   - however, for given number of processes application will hang for some
   slightly larger matrix
   - for the given matrix size and number of processes where I have program
   hanging, if I reduce the size of the message in each MPI_Bcat call twice
   (of course the result will not be correct), there will not be hanging

So, it seems to me that problem could be in some buffers that MPI uses, and
maybe some default MCA parameter should be changed, but, as I said, I do
not have a lot of experience in MPI programming, and I have not found
solution for this problem. So, the question is whether anyone has had a
similar problem, and maybe knows if this could be solved by setting
appropriate MCA parameter, or knows any other solution or explanation?

Thanks,
Dusan Zoric