Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] CPU burning in Wait state
From: George Bosilca (bosilca_at_[hidden])
Date: 2008-09-03 13:47:52


This program is 100% correct from an MPI perspective. However, in
Open MPI (and, I think, in most other MPI implementations) a
collective communication busy-polls and will eat most of a CPU core
while it waits, just like all other blocking functions.
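
One thing that helps with the CPU burning itself, assuming a
reasonably recent Open MPI build, is the yield-when-idle MCA
parameter; the waiting processes still poll, but they give the CPU
back between polls (./your_app below is just a placeholder):

   mpirun --mca mpi_yield_when_idle 1 -np 4 ./your_app

It does not make the wait free, but it keeps the node usable for
other work.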

Now, to answer your original post: using non-blocking communications
in this particular case will give you a benefit, because the data
involved in the communications is small enough to achieve a perfect
overlap. If you try to do exactly the same thing with larger data,
non-blocking communications will hurt performance instead, because
MPI is not required to make progress while the user application is
not inside an MPI call.
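
To make this concrete, here is a rough, untested sketch of the
workaround you described (isend/irecv plus a test loop); I use plain
sends on the root for simplicity, and the one-second sleep between
polls is arbitrary, its only purpose is to keep the waiting ranks off
the CPU:

program bcast_workaround
  use mpi
  implicit none
  integer :: wsize, rank, ierr, i, req
  integer :: status(MPI_STATUS_SIZE)
  logical :: done
  integer :: data

  call mpi_init(ierr)
  call mpi_comm_size(MPI_COMM_WORLD, wsize, ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr)

  if (rank == 0) then
     call sleep(100)        ! the root is busy producing the value
     data = 10
     do i = 1, wsize - 1
        call mpi_send(data, 1, MPI_INTEGER, i, 0, MPI_COMM_WORLD, ierr)
     end do
  else
     call mpi_irecv(data, 1, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, req, ierr)
     done = .false.
     do while (.not. done)
        call mpi_test(req, done, status, ierr)  ! also lets MPI progress
        if (.not. done) call sleep(1)           ! yield instead of spinning
     end do
  end if

  print *, "Done in #", rank, " => data=", data
  call mpi_finalize(ierr)
end program bcast_workaround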

   george.

On Sep 3, 2008, at 6:32 PM, Vincent Rotival wrote:

> Ok, let's take the simple example here. I might have used the wrong
> terms, and I apologize for that.
>
> While the rank 0 process is sleeping, the other ones are in the
> bcast waiting for data.
>
>
>
> program test
> use mpi
> implicit none
>
> integer :: mpi_wsize, mpi_rank, mpi_err
> integer :: data
>
> call mpi_init(mpi_err)
> call mpi_comm_size(MPI_COMM_WORLD, mpi_wsize, mpi_err)
> call mpi_comm_rank(MPI_COMM_WORLD, mpi_rank, mpi_err)
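> ! rank 0 sleeps before producing the value; the other ranks enter
> ! the bcast right away and wait there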
> if(mpi_rank.eq.0) then
> call sleep(100)
> data = 10
> end if
>
> call mpi_bcast(data, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, mpi_err)
>
> print *, "Done in #", mpi_rank, " => data=", data
>
> call mpi_finalize(mpi_err)
>
> end program test
>
>
> George Bosilca wrote:
>>
>> On Sep 3, 2008, at 6:11 PM, Vincent Rotival wrote:
>>
>>> Eugene,
>>>
>>> No, what I'd like is that when doing something like
>>>
>>> call mpi_bcast(data, 1, MPI_INTEGER, 0, .....)
>>>
>>> the program continues AFTER the Bcast is completed (so control is
>>> not returned to the user before then), but while the threads with
>>> rank > 0 are waiting in the Bcast they do not consume CPU resources.
>>
>> Threads with rank > 0? Now, this scares me!!! If all your threads
>> are going into the bcast, then I guess the application is not
>> correct from the MPI standard's perspective (i.e. on each
>> communicator there can be only one collective in flight at any
>> moment). In MPI, each process (and not each thread) has a rank, and
>> each process exists in each communicator only once. In other words,
>> as each collective is bound to a specific communicator, on each of
>> your processes only one thread should enter the MPI_Bcast if you
>> want only ONE collective.
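>>
>> To illustrate, a rough sketch of mine, assuming an OpenMP-threaded
>> rank: only the master thread ever touches MPI, which is exactly the
>> FUNNELED pattern.
>>
>> program funneled_bcast
>> use mpi
>> implicit none
>> integer :: provided, rank, ierr
>> integer :: data
>>
>> ! request FUNNELED: threads may exist, but only the main thread
>> ! makes MPI calls (one should check that provided is high enough)
>> call mpi_init_thread(MPI_THREAD_FUNNELED, provided, ierr)
>> call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr)
>> if (rank == 0) data = 10
>>
>> !$omp parallel
>> !$omp master
>> ! exactly one thread per process takes part in the collective
>> call mpi_bcast(data, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
>> !$omp end master
>> !$omp end parallel
>>
>> print *, "rank", rank, "got", data
>> call mpi_finalize(ierr)
>> end program funneled_bcast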
>>
>> george.
>>
>>>
>>>
>>> I hope this is clearer; I apologize for not being clear in the
>>> first place.
>>>
>>> Vincent
>>>
>>>
>>>
>>> Eugene Loh wrote:
>>>>
>>>> Vincent Rotival wrote:
>>>>
>>>>> The solution I retained was for the main thread to isend the data
>>>>> separately to each of the other threads, which use Irecv plus a
>>>>> loop on mpi_test to check for completion of the Irecv. It might be
>>>>> dirty, but it works much better than using Bcast.
>>>>
>>>> Thanks for the clarification.
>>>>
>>>> But this strikes me more as a question about the MPI standard
>>>> than about the Open MPI implementation. That is, what you really
>>>> want is for the MPI API to support a non-blocking form of
>>>> collectives. You want control to return to the user program
>>>> before the barrier/bcast/etc. operation has completed. That's an
>>>> API change.
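>>>>
>>>> As a purely hypothetical sketch (no such call exists in the
>>>> standard today), a non-blocking bcast would hand back a request
>>>> that you complete later, e.g.:
>>>>
>>>> call mpi_ibcast(data, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, req, ierr)
>>>> ! ... useful, non-MPI work overlapping the broadcast ...
>>>> call mpi_wait(req, MPI_STATUS_IGNORE, ierr)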