Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] busy waiting and oversubscriptions
From: Ross Boylan (ross_at_[hidden])
Date: 2014-03-26 12:08:34

On Wed, 2014-03-26 at 10:27 +0000, Jeff Squyres (jsquyres) wrote:
> On Mar 26, 2014, at 1:31 AM, Andreas Schäfer <gentryx_at_[hidden]> wrote:
> >> Even when "idle", MPI processes use all the CPU. I thought I remembered
> >> someone saying that they will be low priority, and so not pose much of
> >> an obstacle to other uses of the CPU.
> >
> > well, if they're blocking in an MPI call, then they'll be doing a busy
> > wait, so each thread will easily churn up 100% CPU load.
> +1
This seems to restate the premise of my question. Is it meant to lead
to the answer "A process in busy wait blocks other users of the CPU to
the same extent as any other process at 100%"?
> >> At any rate, my question is whether, if I have processes that spend most
> >> of their time waiting to receive a message, I can put more of them than
> >> I have physical cores without much slowdown?
> >
> > AFAICS there will always be a certain slowdown. Is there a reason why
> > you would want to oversubscribe your nodes?
> Agreed -- this is not a good idea. It suggests that you should make your existing code more efficient -- perhaps by overlapping communication and computation.
My motivation was to get more work done with a given number of CPUs, and
also to find out how much of a burden I was imposing on other users.

My application consists of processes that have different roles. Some of
the roles are important but involve little computation. My hope was that
I could add processes for those roles without imposing much of a burden.

Second, we do not operate in a batch queuing environment and so
different users can end up sharing a CPU (though we try to avoid it).
So I was wondering whether my "idle but busy waiting" processes would
really get in the way of others.
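
To make the question concrete, here is a minimal sketch in plain Python
(not MPI, and not how Open MPI implements its waits) that compares the CPU
time charged to a process while it waits for a flag in two ways: spinning
without pause, and checking with a short sleep between polls. The names
wait_for_flag and measure, the 0.5 second wait, and the 1 ms poll interval
are all hypothetical choices for illustration. It says nothing about
scheduler priority; it only shows that a spinning waiter accumulates CPU
time the way a computing process does.

```python
import threading
import time


def wait_for_flag(flag, poll_interval=None):
    """Block until `flag` is set; return CPU seconds the process used meanwhile.

    poll_interval=None spins without pausing (a busy wait);
    otherwise the loop sleeps that many seconds between checks.
    """
    start = time.process_time()
    while not flag.is_set():
        if poll_interval is not None:
            time.sleep(poll_interval)  # give the CPU back to the scheduler
    return time.process_time() - start


def measure(poll_interval, wait_seconds=0.5):
    """Have a timer thread set the flag after `wait_seconds`; report CPU used."""
    flag = threading.Event()
    timer = threading.Timer(wait_seconds, flag.set)
    timer.start()
    cpu_seconds = wait_for_flag(flag, poll_interval)
    timer.join()
    return cpu_seconds


if __name__ == "__main__":
    print("busy wait     CPU seconds:", measure(None))
    print("sleeping poll CPU seconds:", measure(0.001))
```

On an otherwise idle machine I would expect the busy wait to report CPU
time close to the wall-clock wait and the sleeping poll to report far less,
at the cost of up to one poll interval of extra latency before the waiter
notices the flag.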

Finally, overlapping communication and computation is a bit tricky. The
recent thread I started about Isend indicates that communication
requires the involvement of both the sender and receiver processes and
if one of them is busy with computation it can really slow things down.
I seem to have gotten good results by using Isend generally, in
particular when sending messages to the heavy computing processes, and
Send when sending from those same processes.