Open MPI User's Mailing List Archives

Subject: [OMPI users] MPI_Allreduce hangs
From: Martin Siegert (siegert_at_[hidden])
Date: 2012-04-23 13:32:19


Hi,

I am debugging a program that hangs in MPI_Allreduce (openmpi-1.4.3).
An strace of one of the processes shows:

Process 10925 attached with 3 threads - interrupt to quit
[pid 10927] poll([{fd=17, events=POLLIN}, {fd=16, events=POLLIN}], 2, -1 <unfinished ...>
[pid 10926] select(15, [8 14], [], NULL, NULL <unfinished ...>
[pid 10925] poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=10, events=POLLIN}], 5, 0) = 0 (Timeout)
[pid 10925] poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=10, events=POLLIN}], 5, 0) = 0 (Timeout)
[pid 10925] poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=10, events=POLLIN}], 5, 0) = 0 (Timeout)
[pid 10925] poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=10, events=POLLIN}], 5, 0) = 0 (Timeout)
...

The program is a Fortran program using 64-bit integers (compiled with -i8),
and I correspondingly built Open MPI (version 1.4.3) with -i8 in the
Fortran compiler flags as well.

The program is somewhat difficult to debug since it takes 3 days to reach
the point where it hangs. This is what I found so far:

MPI_Allreduce is called as

call MPI_Allreduce(MPI_IN_PLACE, recvbuf, count, MPI_DOUBLE_PRECISION, &
                   MPI_SUM, MPI_COMM_WORLD, mpierr)

with count = 455295488. Since the Fortran interface just calls the
C routines in Open MPI, and count arguments are 32-bit integers in C,
I started to wonder what the largest count is for which an MPI_Allreduce
succeeds. E.g., in MPICH (it has been a while since I last looked into
this, i.e., this may or may not be correct anymore) all send/recv
operations were converted into send/recv of MPI_BYTE, so the largest
count for doubles was (2^31-1)/8 = 268435455. Thus, I started to wrap
the MPI_Allreduce call in a myMPI_Allreduce routine that repeatedly
calls MPI_Allreduce on smaller chunks when the count is larger than
some value maxallreduce (myMPI_Allreduce.f90 is attached). I have
tested the routine with a trivial program that just fills an array
with numbers and calls myMPI_Allreduce, and this test succeeds.
However, with the real program the situation is very strange:
when I set maxallreduce = 268435456, the program hangs at the first call
(iallreduce = 1) to MPI_Allreduce in the do loop

         do iallreduce = 1, nallreduce - 1
            idx = (iallreduce - 1)*length + 1
            call MPI_Allreduce(MPI_IN_PLACE, recvbuf(idx), length, &
                               datatype, op, comm, mpierr)
            if (mpierr /= MPI_SUCCESS) return
         end do

With maxallreduce = 134217728 the first call succeeds, the second hangs.
For maxallreduce = 67108864, the first two calls to MPI_Allreduce complete,
but the third (iallreduce = 3) hangs. For maxallreduce = 8388608 the
17th call hangs, for 1048576 the 138th call hangs; here is a table
(values from gdb attached to process 0 when the program hangs):

maxallreduce  iallreduce        idx     length
   268435456           1          1  227647744
   134217728           2  113823873  113823872
    67108864           3  130084427   65042213
     8388608          17  137447697    8590481
     1048576         138  143392010    1046657

It is as if there are some element(s) in the middle of the array, with
idx >= 143392010, that cannot be sent or received.

Has anybody seen this kind of behaviour?
Does anybody have an idea what could be causing this?
Any ideas for how to get around this?
Anything that could help would be appreciated ... I have already spent a
huge amount of time on this and I am running out of ideas.

Cheers,
Martin

-- 
Martin Siegert
Simon Fraser University
Burnaby, British Columbia
Canada