
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Using Boost::Thread for multithreading within OpenMPI processes
From: Thomas Watson (exascale.system_at_[hidden])
Date: 2013-04-24 11:47:10


Thanks Jeff! That's very helpful.

Cheers!

Jacky

On Wed, Apr 24, 2013 at 10:56 AM, Jeff Squyres (jsquyres) <
jsquyres_at_[hidden]> wrote:

> On Apr 24, 2013, at 10:24 AM, Thomas Watson <exascale.system_at_[hidden]>
> wrote:
>
> > I still have a couple of questions to ask:
> >
> > 1. In both MPI_THREAD_FUNNELED and MPI_THREAD_SERIALIZED modes, MPI
> > calls are made by only one thread at a time (in the former case, only the
> > rank's main thread can make MPI calls, while in the latter case the
> > threads need to be coordinated so that only one thread makes MPI calls at
> > a time). So are there any performance implications associated with
> > choosing between FUNNELED and SERIALIZED?
>
> In Open MPI, no.
>
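The coordination that MPI_THREAD_SERIALIZED asks of the application can be sketched with a single mutex shared by all threads in a rank. This is only an illustration, not Open MPI code: `fake_mpi_call`, `run_serialized`, and `SerializedStats` are hypothetical names, and the stand-in function marks the spot where a real MPI call would go. A real program would also request the thread level at startup with MPI_Init_thread and check the level it is actually given.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

struct SerializedStats {
    int total_calls;     // number of stand-in "MPI" calls that were made
    int max_concurrent;  // most threads ever observed inside a call at once
};

// Hypothetical stand-in for an MPI call (a real program would call, e.g.,
// MPI_Send here). It only records how many threads are inside it at once.
void fake_mpi_call(std::atomic<int>& in_call, std::atomic<int>& max_seen,
                   std::atomic<int>& total) {
    int now = ++in_call;
    int prev = max_seen.load();
    while (now > prev && !max_seen.compare_exchange_weak(prev, now)) {
        // prev is refreshed on failure; retry until max_seen >= now
    }
    ++total;
    --in_call;
}

// Every thread may "call MPI", but only while holding mpi_mutex, so at most
// one thread is inside the call at any time (the SERIALIZED requirement).
SerializedStats run_serialized(int num_threads, int calls_per_thread) {
    assert(num_threads > 0 && calls_per_thread >= 0);
    std::mutex mpi_mutex;
    std::atomic<int> in_call{0}, max_seen{0}, total{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < calls_per_thread; ++i) {
                // Thread-local computation would happen here, outside the lock.
                std::lock_guard<std::mutex> lock(mpi_mutex);
                fake_mpi_call(in_call, max_seen, total);
            }
        });
    }
    for (auto& w : workers) w.join();
    return {total.load(), max_seen.load()};
}
```

Calling `run_serialized(4, 1000)` runs four threads that each make 1000 guarded calls. Under FUNNELED the lock would be unnecessary, since only the main thread would make the calls in the first place.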
> > 2. My current code uses many MPI collective calls
> > (gather/scatter/broadcast, etc.). It seems that these collective calls
> > have some negative impact on performance because ALL MPI processes need
> > to wait on each of these calls. I would like to explore the idea of
> > decoupling computation from MPI communication, so that if one thread of
> > each MPI rank is blocked in an MPI call, the other threads can still make
> > progress. I am wondering if I could still make MPI calls from the other
> > non-blocked threads using MPI_THREAD_FUNNELED or MPI_THREAD_SERIALIZED
> > mode (assuming that the blocked thread is the main thread in the rank)?
>
> MPI-3 introduced the concept of non-blocking collectives (e.g.,
> MPI_Igather). Open MPI 1.7.x has preliminary versions of these, but the
> implementations concentrated on correctness: they haven't been optimized
> yet. You might need to check how well MPI_Gather performs in a separate
> thread vs. MPI_Igather.
>
> Also, be aware that not all collectives are synchronizing. Depending on
> the back-end algorithm that is used to implement any given collective, one
> MPI process may return much earlier from a collective call than one of its
> peers in the same collective call. For example, with MPI_Gather of a short
> message, all non-root processes might do an eager send and return
> more-or-less immediately. The root will need to block, however, until all
> messages are received.
>
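The decoupling idea from question 2 can be sketched in a FUNNELED-style layout: the main thread is the only one that would touch MPI, and worker threads keep computing while it is blocked. This is a rough sketch under stated assumptions, not a tuned implementation. `fake_blocking_collective` is a hypothetical stand-in that merely sleeps where a call such as MPI_Gather would block, and `overlap_compute_with_comm` is an illustrative name.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <future>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical stand-in for a blocking collective such as MPI_Gather.
// It only sleeps; a real program would make the MPI call here.
void fake_blocking_collective() {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
}

// Sum `data` in chunks on worker threads while the calling (main) thread
// is blocked in the stand-in collective. Workers never touch "MPI".
long long overlap_compute_with_comm(const std::vector<int>& data,
                                    std::size_t num_workers) {
    assert(num_workers > 0);
    const std::size_t chunk = (data.size() + num_workers - 1) / num_workers;
    std::vector<std::future<long long>> parts;
    for (std::size_t w = 0; w < num_workers; ++w) {
        const std::size_t begin = std::min(w * chunk, data.size());
        const std::size_t end = std::min(begin + chunk, data.size());
        parts.push_back(std::async(std::launch::async, [&data, begin, end] {
            auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
            auto last = data.begin() + static_cast<std::ptrdiff_t>(end);
            return std::accumulate(first, last, 0LL);
        }));
    }
    fake_blocking_collective();  // main thread "blocked in MPI" meanwhile
    long long total = 0;
    for (auto& p : parts) total += p.get();  // collect the workers' results
    return total;
}
```

With MPI_Igather the same overlap could instead be had from a single thread: start the collective, compute, then wait on the request. Which variant performs better is exactly what would need to be measured.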
> Make sense?
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]