Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_Finalize() maintains load at 100%.
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2014-05-24 07:56:52


Sorry to jump in late on this thread, but here are my thoughts:

1. Your initial email said "threads", not "processes". I assume you actually meant "processes" (having multiple threads call MPI_FINALIZE is erroneous).

2. Periodically over the years, we have gotten the infrequent request to support some form of MPI progress that does not consume 100% of the CPU. Frankly, there's never been enough demand to justify the work that it will require (remember: the common case is highest possible performance, which demands 100% of the CPU -- having a "slow path" and a "fast path" can be somewhat intrusive here, and can hurt the "fast path"). FWIW: MPI_FINALIZE is just another MPI call, and it still potentially needs to make progress on MPI message passing, so it follows the same progression behavior as all other MPI calls.

3. Also remember that in Open MPI, MPI_FINALIZE is likely to block, anyway, until everyone calls it.

4. If you have an imbalance like this (processes call MPI_FINALIZE at different times) and really can't abide 100% CPU usage, then there are a few schemes you might try to mitigate it:

- use a non-blocking barrier (MPI_IBARRIER) and periodically call MPI_TEST to see if everyone has reached that point -- interlacing that MPI_TEST in the middle of real work. Once the MPI_IBARRIER request completes, everyone calls MPI_FINALIZE.

- use some kind of algorithm to calculate when everyone can call MPI_FINALIZE (i.e., an absolute time -- assuming all your compute nodes are NTP-synchronized). Then do real work, but call MPI_FINALIZE at the prescribed time.

On May 23, 2014, at 3:08 PM, Ralph Castain <rhc_at_[hidden]> wrote:

> Hmmm...okay, good news and bad news :-)
>
> Good news: this works fine on 1.8, so I'd suggest updating to that release series (either 1.8.1 or the nightly 1.8.2)
>
> Bad news: if one proc is going to exit without calling Finalize, they all need to do so else you will hang in Finalize. The problem is that Finalize invokes a barrier, and some of the procs aren't there any more to participate.
>
>
> On May 23, 2014, at 12:03 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>
>> I'll check to see - should be working
>>
>> On May 23, 2014, at 8:07 AM, Iván Cores González <ivan.coresg_at_[hidden]> wrote:
>>
>>>> I assume you mean have them exit without calling MPI_Finalize ...
>>>
>>> Yes, that's my idea: exit some processes while the others continue. I am trying to
>>> use the "orte_allowed_exit_without_sync" flag in the following code (note that the code
>>> is different):
>>>
>>> #include <mpi.h>
>>> #include <stdio.h>
>>> #include <stdlib.h>
>>> #include <unistd.h>
>>>
>>> int main( int argc, char *argv[] )
>>> {
>>>     MPI_Init(&argc, &argv);
>>>
>>>     int myid;
>>>     MPI_Comm_rank(MPI_COMM_WORLD, &myid);
>>>
>>>     if (myid == 0)
>>>     {
>>>         printf("Exit P0 ...\n");
>>>         // With "--mca orte_allowed_exit_without_sync 1" this
>>>         // process should die, but not P1, P2, ...; is that right?
>>>         exit(0);
>>>     }
>>>
>>>     // Imagine some important job here
>>>     sleep(20);
>>>
>>>     printf("Calling MPI_Finalize() ...\n");
>>>     // Process 0 maintains load at 100%.
>>>     MPI_Finalize();
>>>     return 0;
>>> }
>>>
>>> and the cmd:
>>> mpirun --mca orte_allowed_exit_without_sync 1 -hostfile ./hostfile -np 2 --prefix /share/apps/openmpi/gcc/ib/1.7.2 ./a.out
>>>
>>> But it does not work: the whole job fails at the "exit(0)" call. Maybe I don't understand your response...
>>>
>>>
>>> Sorry for not responding in order; I am having some problems
>>> receiving the Open MPI mails.
>>>
>>>> In my codes, I am using the MPI_Send and MPI_Recv functions to notify P0 that
>>>> every other process has finished its own calculations. Maybe you can
>>>> also use the same method and keep P0 waiting until it receives some data
>>>> from the other processes?
>>>
>>> This solution was my first idea, but I can't do it. I use spawned processes and
>>> different communicators to manage "groups" of processes, so the ideal behaviour
>>> is that processes finish and die (or at least don't stay at 100% load) when
>>> they finish their work. It's a bit hard to explain.
>>>
>>>
>>>
>>>
>>> ----- Original Message -----
>>> From: "Ralph Castain" <rhc_at_[hidden]>
>>> To: "Open MPI Users" <users_at_[hidden]>
>>> Sent: Friday, May 23, 2014 16:39:34
>>> Subject: Re: [OMPI users] MPI_Finalize() maintains load at 100%.
>>>
>>>
>>> On May 23, 2014, at 7:21 AM, Iván Cores González <ivan.coresg_at_[hidden]> wrote:
>>>
>>>> Hi Ralph,
>>>> Thanks for your response.
>>>> I see your point. I tried to change the algorithm, but some processes finish while the others are still calling MPI functions. I can't avoid this behaviour.
>>>> The ideal behaviour would be for the processes to go to sleep (or at least not use 100% of the CPU) when MPI_Finalize is called.
>>>>
>>>> For the time being, maybe the fastest solution is to insert a "manual" sleep before MPI_Finalize.
>>>>
>>>> Another question: would it be possible to kill some MPI processes without making mpirun fail? Or is this behaviour impossible?
>>>
>>> I assume you mean have them exit without calling MPI_Finalize, so they don't block? Technically, yes, though we wouldn't recommend that behavior. You can add "-mca orte_allowed_exit_without_sync 1" to your cmd line (or set the mca param in your environment, etc.) and mpirun won't terminate you if a proc exits without calling MPI_Finalize. We will still, however, terminate the job if (a) a proc dies by signal (e.g., segfaults), or (b) a proc exits with a non-zero status, so you'll still have some protection from hangs.
>>>
>>>>
>>>> Thanks,
>>>> Ivan Cores
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
>

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/