Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] System hang-up on MPI_Reduce
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-11-11 08:25:44


You are welcome to stick barriers in; they don't hurt anything other
than performance.

On Nov 11, 2009, at 3:00 AM, Glembek Ondřej wrote:

> Thanks for your reply.
>
> My coll_sync_priority is set to 50; see the dump of ompi_info
> --param coll sync below.
>
> Does sticking barriers in hurt anything, or is it just a cosmetic
> thing? I'm fine with this solution.
>
> Thanks,
> Ondrej
>
>
> $ ompi_info --param coll sync
>   MCA coll: parameter "coll" (current value: <none>, data source: default value)
>             Default selection set of components for the coll framework
>             (<none> means use all components that can be found)
>   MCA coll: parameter "coll_base_verbose" (current value: "0", data source: default value)
>             Verbosity level for the coll framework (0 = no verbosity)
>   MCA coll: parameter "coll_sync_priority" (current value: "50", data source: default value)
>             Priority of the sync coll component; only relevant if
>             barrier_before or barrier_after is > 0
>   MCA coll: parameter "coll_sync_barrier_before" (current value: "1000", data source: default value)
>             Do a synchronization before each Nth collective
>   MCA coll: parameter "coll_sync_barrier_after" (current value: "0", data source: default value)
>             Do a synchronization after each Nth collective
>
>
> Quoting "Ralph Castain" <rhc_at_[hidden]>:
>
>> Yeah, that is "normal". It has to do with unexpected messages.
>>
>> When you have procs running at significantly different speeds, the
>> various operations get far enough out of sync that the memory
>> consumed by received messages not yet processed grows too large.
>>
>> Instead of sticking barriers into your code, you can have OMPI do
>> an internal sync after every so many operations to avoid the
>> problem. This is done by enabling the "sync" collective component,
>> and then adjusting the number of operations between forced syncs.
>>
>> Do an "ompi_info --params coll sync" to see the options. Then set
>> the coll_sync_priority to something like 100 and it should work for
>> you.
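
For reference, these coll_sync settings can also be passed straight on the
mpirun command line. A minimal sketch (the executable name, process count,
and the barrier interval of 100 are placeholder values, not ones taken from
this thread):

    mpirun --mca coll_sync_priority 100 \
           --mca coll_sync_barrier_before 100 \
           -np 32 ./your_app

coll_sync_barrier_before is the "every Nth collective" interval shown in the
ompi_info output quoted above; smaller values synchronize more often, at some
cost in performance.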
>>
>> Ralph
>>
>> On Nov 10, 2009, at 2:45 PM, Glembek Ondřej wrote:
>>
>>> Hi,
>>>
>>> I am using the MPI_Reduce operation on a 122880x400 matrix of
>>> doubles. The parallel job runs on 32 machines, each with a
>>> different processor in terms of speed, but the architecture and OS
>>> are the same on all of them (x86_64). The task is a typical
>>> map-and-reduce, i.e. each process collects some data, which is
>>> then summed (MPI_Reduce with MPI_SUM).
>>>
>>> Because the processors differ in speed, each process reaches
>>> MPI_Reduce at a different time.
>>>
>>> The *first problem* came when I called MPI_Reduce on the whole
>>> matrix: the job ended with an *MPI_ERR_OTHER* error, each time on
>>> a different rank. I worked around this by chunking the matrix into
>>> 2048 submatrices and calling MPI_Reduce in a loop.
>>>
>>> However, a *second problem* arose: MPI_Reduce hangs. It apparently
>>> gets stuck in some kind of deadlock. If the processors are of
>>> similar speed the problem seems to disappear, but I cannot
>>> guarantee that condition all the time.
>>>
>>> I managed to get rid of the problem (at least for a few
>>> iterations) by sticking an MPI_Barrier before the MPI_Reduce line.
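
A minimal C sketch of the chunked reduction with a barrier in front of each
MPI_Reduce, as described above (the function name, the chunk arithmetic, and
reducing to rank 0 are illustrative assumptions, not code from this thread):

    #include <mpi.h>
    #include <stddef.h>

    /* 122880 x 400 doubles, reduced in 2048 chunks of 60 rows each */
    enum { ROWS = 122880, COLS = 400, CHUNKS = 2048 };

    static void sum_matrix(double *local, double *global, MPI_Comm comm)
    {
        int rank;
        MPI_Comm_rank(comm, &rank);
        const int chunk = (ROWS / CHUNKS) * COLS;  /* 24000 elements per chunk */

        for (int c = 0; c < CHUNKS; c++) {
            /* keep fast and slow ranks roughly in step before each reduction */
            MPI_Barrier(comm);
            MPI_Reduce(local + (size_t)c * chunk,
                       rank == 0 ? global + (size_t)c * chunk : NULL,
                       chunk, MPI_DOUBLE, MPI_SUM, 0, comm);
        }
    }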
>>>
>>> The questions are:
>>>
>>> 1) Is this usual behavior?
>>> 2) Is there some kind of timeout for MPI_Reduce?
>>> 3) Why does MPI_Reduce die on a large amount of data if the system
>>> has enough address space (64-bit compilation)?
>>>
>>> Thanks
>>> Ondrej Glembek
>>>
>>>
>>> --
>>> Ondrej Glembek, PhD student E-mail: glembek_at_[hidden]
>>> UPGM FIT VUT Brno, L226 Web: http://www.fit.vutbr.cz/~glembek
>>> Bozetechova 2, 612 66 Phone: +420 54114-1292
>>> Brno, Czech Republic Fax: +420 54114-1290
>>>
>>> ICQ: 93233896
>>> GPG: C050 A6DC 7291 6776 9B69 BB11 C033 D756 6F33 DE3C
>
>
>
> --
> Ondrej Glembek, PhD student E-mail: glembek_at_[hidden]
> UPGM FIT VUT Brno, L226 Web: http://www.fit.vutbr.cz/~glembek
> Bozetechova 2, 612 66 Phone: +420 54114-1292
> Brno, Czech Republic Fax: +420 54114-1290
>
> ICQ: 93233896
> GPG: C050 A6DC 7291 6776 9B69 BB11 C033 D756 6F33 DE3C