Open MPI User's Mailing List Archives

From: Jonathan Underwood (jonathan.underwood_at_[hidden])
Date: 2007-06-11 19:05:37


Hi Adrian,

On 11/06/07, Adrian Knoth <adi_at_[hidden]> wrote:
> Which OMPI version?
>

1.2.2

> > $ perl -e 'die$!=110'
> > Connection timed out at -e line 1.
>
> Looks pretty much like a routing issue. Can you sniff on eth1 on the
> frontend node?
>

I don't have root access, so I'm afraid not.
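
For reference, the perl one-liner quoted above just turns errno 110 into its message. A minimal sketch of the same lookup in Python follows; the helper name describe_errno is made up for illustration, and 110 maps to ETIMEDOUT on Linux but may differ on other platforms.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno value.

    Hypothetical helper for illustration; the name falls back to
    "unknown" if this platform does not define the given code.
    """
    name = errno.errorcode.get(code, "unknown")
    return name, os.strerror(code)


# 110 is the value from the error above; its meaning is platform-specific.
name, message = describe_errno(110)
print(f"errno 110: {name} ({message})")
```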

> > This error message occurs the first time one of the compute nodes,
> > which are on a private network, attempts to send data to the frontend
>
> > In actual fact, it seems that the error occurs the first time a
> > process on the frontend tries to send data to another process on the
> > frontend.
>
> What's the exact problem? compute-node -> frontend? I don't think you
> have two processes on the frontend node, and even if you do, they should
> use shared memory.
>
> > Any advice would be very welcome
>
> Use tcpdump and/or recompile with debug enabled. In addition, set
> WANT_PEER_DUMP in ompi/mca/btl/tcp/btl_tcp_endpoint.c to 1 (line 120)
> and recompile, thus giving you more debug output.
>
> Depending on your OMPI version, you can also add
>
> mpi_preconnect_all=1
>
> to your ~/.openmpi/mca-params.conf, thereby establishing all connections
> during MPI_Init().
>

OK, will try these things.
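
Since tcpdump needs root, an unprivileged check might be to attempt a plain TCP connection between the nodes and see whether it times out. This is only a rough sketch: can_connect is a made-up helper, and the host name and port you pass in are assumptions that depend on the cluster (it needs something listening on the target port).

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Try a TCP connection; return (True, None) or (False, the OSError).

    Hypothetical helper for illustration. A timeout here would point at
    the same routing problem as errno 110, while "connection refused"
    means the host is reachable but nothing listens on that port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        return False, exc
```

Run from a compute node, something like can_connect("frontend", 22) (both values are placeholders) would show whether the frontend's address is reachable from the private network at all.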

> If nothing helps, exclude the frontend from computation.
>
>

OK.

Thanks for the suggestions!

Jonathan