
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] MPI_Allreduce hangs
From: Brock Palen (brockp_at_[hidden])
Date: 2012-04-24 16:19:31


To throw in my $0.02, though it is worth less.

Were you running this on verbs-based InfiniBand?

We see a problem, only on IB, that we have a workaround for even with the newest 1.4.5; we can reproduce it with IMB. You can find an old thread from me about it, though your problem might not be the same.

Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
brockp_at_[hidden]
(734)936-1985

On Apr 24, 2012, at 3:09 PM, Jeffrey Squyres wrote:

> Could you repeat your tests with 1.4.5 and/or 1.5.5?
>
>
> On Apr 23, 2012, at 1:32 PM, Martin Siegert wrote:
>
>> Hi,
>>
>> I am debugging a program that hangs in MPI_Allreduce (openmpi-1.4.3).
>> An strace of one of the processes shows:
>>
>> Process 10925 attached with 3 threads - interrupt to quit
>> [pid 10927] poll([{fd=17, events=POLLIN}, {fd=16, events=POLLIN}], 2, -1 <unfinished ...>
>> [pid 10926] select(15, [8 14], [], NULL, NULL <unfinished ...>
>> [pid 10925] poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=10, events=POLLIN}], 5, 0) = 0 (Timeout)
>> [pid 10925] poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=10, events=POLLIN}], 5, 0) = 0 (Timeout)
>> [pid 10925] poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=10, events=POLLIN}], 5, 0) = 0 (Timeout)
>> ...
>>
>> The program is a Fortran program using 64-bit integers (compiled with -i8),
>> and I correspondingly built Open MPI (version 1.4.3) with -i8 for
>> the Fortran compiler as well.
>>
>> The program is somewhat difficult to debug since it takes 3 days to reach
>> the point where it hangs. This is what I found so far:
>>
>> MPI_Allreduce is called as
>>
>> call MPI_Allreduce(MPI_IN_PLACE, recvbuf, count, MPI_DOUBLE_PRECISION, &
>>                    MPI_SUM, MPI_COMM_WORLD, mpierr)
>>
>> with count = 455295488. Since the Fortran interface just calls the
>> C routines in Open MPI, and count arguments are 32-bit integers in C,
>> I started to wonder what the largest count is for which an MPI_Allreduce
>> succeeds. E.g., in MPICH (it has been a while since I looked into this,
>> i.e., this may or may not be correct anymore) all send/recv operations
>> were converted into send/recv of MPI_BYTE, so the largest count for
>> doubles was (2^31-1)/8 = 268435455. Thus, I started to wrap the
>> MPI_Allreduce call with a myMPI_Allreduce routine that repeatedly calls
>> MPI_Allreduce when the count is larger than some value maxallreduce
>> (the myMPI_Allreduce.f90 is attached). I have tested the routine with a
>> trivial program that just fills an array with numbers and calls
>> myMPI_Allreduce, and this test succeeds.
>> However, with the real program the situation is very strange:
>> when I set maxallreduce = 268435456, the program hangs at the first call
>> (iallreduce = 1) to MPI_Allreduce in the do loop
>>
>> do iallreduce = 1, nallreduce - 1
>>    idx = (iallreduce - 1)*length + 1
>>    call MPI_Allreduce(MPI_IN_PLACE, recvbuf(idx), length, &
>>                       datatype, op, comm, mpierr)
>>    if (mpierr /= MPI_SUCCESS) return
>> end do
>>
>> With maxallreduce = 134217728 the first call succeeds, the second hangs.
>> For maxallreduce = 67108864, the first two calls to MPI_Allreduce complete,
>> but the third (iallreduce = 3) hangs. For maxallreduce = 8388608 the
>> 17th call hangs, for 1048576 the 138th call hangs; here is a table
>> (values from gdb attached to process 0 when the program hangs):
>>
>> maxallreduce   iallreduce         idx      length
>>    268435456            1           1   227647744
>>    134217728            2   113823873   113823872
>>     67108864            3   130084427    65042213
>>      8388608           17   137447697     8590481
>>      1048576          138   143392010     1046657
>>
>> It is as if there are some element(s) in the middle of the array with
>> idx >= 143392010 that cannot be sent or received.
>>
>> Has anybody seen this kind of behaviour?
>> Does anybody have an idea what could be causing this?
>> Any ideas how to get around this?
>> Anything that could help would be appreciated ... I already spent a
>> huge amount of time on this and I am running out of ideas.
>>
>> Cheers,
>> Martin
>>
>> --
>> Martin Siegert
>> Simon Fraser University
>> Burnaby, British Columbia
>> Canada
>> <myMPI_Allreduce.f90>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
>
>