I am running an application that performs some transformations of large matrices on a 7-node cluster. The nodes are connected via QDR 40 Gbit/s InfiniBand, and Open MPI 1.4.3 is installed on the system.
The matrix transformation requires a large data exchange between the nodes: at each step of the algorithm there is one node that sends data and all the others receive it. The number of processes is equal to the number of nodes used. I have to say that I am relatively new to MPI, but it seemed that the ideal way to do this is with MPI_Bcast.
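For reference, here is a minimal sketch of the pattern I am describing (the variable names and sizes are only illustrative, not my actual code):

    /* Minimal sketch of the pattern: at each step one rank is the
       sender and broadcasts its block to all the others.
       Names and sizes here are illustrative only. */
    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        const int N = 4096;   /* illustrative matrix block dimension */
        double *block = malloc((size_t)N * N * sizeof(double));

        for (int step = 0; step < nprocs; ++step) {
            int root = step;  /* the one sending node at this step */
            if (rank == root) {
                /* root fills 'block' with the data to be sent */
            }
            /* all ranks call MPI_Bcast with the same root, count and datatype */
            MPI_Bcast(block, N * N, MPI_DOUBLE, root, MPI_COMM_WORLD);
            /* ... local transformation using the received block ... */
        }

        free(block);
        MPI_Finalize();
        return 0;
    }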
Everything worked fine for matrices that were not too large. However, when the matrix size increases, at some point the application hangs and stays there forever.
I am not completely sure, but it seems there are no errors in my code. I traced it in detail to check whether some collective operation is left uncompleted before that specific call to MPI_Bcast, but everything looks fine. Also, for that specific call the root is set correctly in all processes, as are the message type and size, and, of course, MPI_Bcast is called in all processes.
I also ran a lot of scenarios (running the application on matrices of different sizes and changing the number of processes) in order to figure out when this happens. What can be observed is the following:
So it seems to me that the problem could be in some buffers that MPI uses, and that maybe some default MCA parameter should be changed, but, as I said, I do not have much experience in MPI programming and I have not found a solution to this problem.
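In case it matters, my understanding is that MCA parameters can be overridden on the mpirun command line, along these lines (the parameter name and value below are only meant to illustrate the syntax; I do not know which parameter, if any, is the relevant one):

    mpirun --mca btl_openib_eager_limit 65536 -np 7 ./my_app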
So, the question is whether anyone has had a similar problem, whether it could perhaps be solved by setting an appropriate MCA parameter, or whether there is any other solution or explanation?