Ole Nielsen wrote:
> Thanks for your suggestion Gus, we need a way of debugging what is going
> on. I am pretty sure the problem lies with our cluster configuration. I
> know MPI simply relies on the underlying network. However, we can ping
> and ssh to all nodes (and between any pair as well), so it is
> currently a mystery why MPI doesn't communicate across nodes on our cluster.
> Two further questions for the group
> 1. I would love to run the test program connectivity.c, but cannot
> find it anywhere. Can anyone help please?
If you downloaded the OpenMPI tarball, it is in examples/connectivity.c
wherever you untarred it [not where you installed it].
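For reference, a typical way to build and run it would be something like the
following. The version directory and host names below are placeholders, not
from your setup; adjust them to your source tree and cluster:

```shell
# Placeholder path and host names; substitute your own.
cd openmpi-x.y.z/examples      # wherever you untarred the source
mpicc connectivity.c -o connectivity
mpirun -np 4 --host node1,node2 ./connectivity
```

Running it across two nodes, as above, is what exercises the inter-node TCP
path that seems to be failing for you.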
> 2. After having left the job hanging over night we got the message
> mca_btl_tcp_frag_recv: readv failed: Connection timed out (110).
> Does anyone know what this means?
> Cheers and thanks
> PS - I don't see how separate buffers would help. Recall that the test
> program I use works fine on other installations, and indeed when run on
> the cores of a single node.
It probably won't help, as Eugene explained.
Your program works here; it also worked for Davendra Rai.
If you were using MPI_Isend [non-blocking],
then you would need separate buffers.
For large amounts of data and many processes,
I would rather use non-blocking communication [and separate
buffers], especially if you do work between the send and the recv.
But that's not what hangs your program.
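To illustrate, here is a minimal sketch of that non-blocking pattern with
separate send and receive buffers. It is not your program: the ring
neighbors, message size, and datatype are made up for the example. Note that
sendbuf must not be touched between the MPI_Isend and the MPI_Waitall:

```c
/* Sketch: non-blocking neighbor exchange with separate buffers.
 * Build: mpicc ring.c -o ring    Run: mpirun -np 4 ./ring
 */
#include <mpi.h>
#include <stdio.h>

#define N 1000

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int right = (rank + 1) % size;          /* illustrative ring neighbors */
    int left  = (rank - 1 + size) % size;

    double sendbuf[N];  /* separate buffers: the Isend may still be     */
    double recvbuf[N];  /* reading sendbuf until the Waitall completes  */
    for (int i = 0; i < N; i++)
        sendbuf[i] = (double) rank;

    MPI_Request reqs[2];
    MPI_Irecv(recvbuf, N, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(sendbuf, N, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

    /* ... do useful work here while communication progresses ... */

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    printf("rank %d received %g from rank %d\n", rank, recvbuf[0], left);

    MPI_Finalize();
    return 0;
}
```

Because the recv is posted before the send, no rank blocks waiting for its
partner, which is why this pattern cannot deadlock the way paired blocking
MPI_Send/MPI_Recv calls can.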
> Message: 11
> Date: Mon, 19 Sep 2011 10:37:02 -0400
> From: Gus Correa <gus_at_[hidden]>
> Subject: Re: [OMPI users] RE : MPI hangs on multiple nodes
> To: Open MPI Users <users_at_[hidden]>
> Message-ID: <4E77538E.3070007_at_[hidden]>
> Content-Type: text/plain; charset=iso-8859-1; format=flowed
> Hi Ole
> You could try the examples/connectivity.c program in the
> OpenMPI source tree, to test if everything is alright.
> It also hints how to solve the buffer re-use issue
> that Sebastien [rightfully] pointed out [i.e., declare separate
> buffers for MPI_Send and MPI_Recv].
> Gus Correa
> users mailing list