Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Need help resolving No route to host error with OpenMPI 1.1.2
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-09-11 14:52:28

On Sep 11, 2008, at 2:38 PM, Eric Thibodeau wrote:

> In short:
> Which of the 3 options is the one known to be unstable in the
> following:
>   --enable-mpi-threads       Enable threads for MPI applications
>                              (default: disabled)
>   --enable-progress-threads  Enable threads asynchronous communication
>                              progress (default: disabled)
>   --with-threads             Set thread type (solaris / posix)

You shouldn't need to specify any of these.

> In long (rationale):
> Just to make sure we don't contradict each other: you're
> suggesting the use of 'listen_thread' but, at the same time, I'm
> telling Prasanna to _disable_ the threads USE flag, which
> translates into the following logic (in the package):

Heh; yes, it's a bit confusing -- I apologize.

The "threads" that I'm saying don't work are the MPI multi-threaded
support (i.e., MPI_THREAD_MULTIPLE) and support for progress threads
within MPI's progression engine.

What *does* work is a tiny threaded TCP listener for incoming
connections. Since the processing for each TCP connection takes a
little time, we found that for scalability reasons, it was good to
have a tiny thread that does nothing but block on TCP accept(), get
the connection, and then hand it off to the main back-end thread for
processing. This allows our accept() rate to be quite high, even if
the actual processing is slower. *This* is the "listen_thread" mode,
and it turns out to be quite necessary for running at scale because our
initial wireup coordination occurs over TCP -- there's a flood of
incoming TCP connections back to the starter. With the threaded TCP
listener, the accept rate is high enough to not cause timeouts for the
incoming TCP flood.
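
To make the accept-and-hand-off pattern concrete, here is a rough sketch in Python. It is not Open MPI's actual code; the names (listener, process, run_demo), the queue used for the hand-off, and the simulated processing delay are all assumptions made for illustration. The point is only that the listener thread never does any per-connection work, so accept() keeps up with the flood even though processing is slow.

```python
import queue
import socket
import threading
import time


def listener(server_sock, handoff, stop):
    """Tiny thread: block on accept() and hand each connection off."""
    while not stop.is_set():
        try:
            conn, _addr = server_sock.accept()
        except socket.timeout:
            continue  # wake up periodically to check the stop flag
        except OSError:
            break  # listening socket was closed
        handoff.put(conn)  # no processing here, just the hand-off


def process(conn):
    """Hypothetical stand-in for the slower per-connection processing."""
    with conn:
        data = conn.recv(64)
        time.sleep(0.01)  # simulate work that is slower than accept()
        return data


def run_demo(n_clients=20):
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # let the OS pick a free port
    server.listen(128)
    server.settimeout(0.1)  # so the listener can notice the stop flag
    port = server.getsockname()[1]

    handoff = queue.Queue()
    stop = threading.Event()
    thread = threading.Thread(
        target=listener, args=(server, handoff, stop), daemon=True
    )
    thread.start()

    # Simulated flood of incoming connections back to the starter.
    clients = []
    for i in range(n_clients):
        c = socket.create_connection(("127.0.0.1", port))
        c.sendall(b"rank %d" % i)
        clients.append(c)

    # The main thread does the slow processing at its own pace.
    results = [process(handoff.get(timeout=5)) for _ in range(n_clients)]

    stop.set()
    thread.join()
    server.close()
    for c in clients:
        c.close()
    return results


if __name__ == "__main__":
    print(len(run_demo()), "connections accepted and processed")
```

In this sketch all of the connections are accepted almost immediately, while the main thread works through them one at a time; without the separate listener, each new accept() would have to wait behind the processing of the previous connection.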

Hope that made sense...

Jeff Squyres
Cisco Systems