Open MPI User's Mailing List Archives

Subject: [OMPI users] Question on MPI_Reduce_scatter limit
From: William Au (au_wai_chung_at_[hidden])
Date: 2012-10-19 19:50:15


Dear all,

I am using Open MPI 1.6 on Linux and have a question about MPI_Reduce_scatter.

I am trying to see how much data can be pushed through MPI_Reduce_scatter,
using the following code.

size = (long) 1024*1024*1024*4;   /* 4 Gi elements; needs a 64-bit long */
for (k = 1; k <= 16; ++k) {
    bufsize = k*size/16;          /* elements in the send buffer */
    for (i = 0; i < nproc; ++i)
        recvCount[i] = bufsize/nproc;   /* equal block for every rank */
    for (i = 0; i < bufsize; ++i)
        sbuf[i] = myid+1;
    printf("buffer size: %ld recvCount[0]:%d\n", bufsize, recvCount[0]);

    MPI_Reduce_scatter(sbuf, rbuf, recvCount, MPI_LONG,
                       MPI_SUM, MPI_COMM_WORLD);
    /* every reduced element should equal 1 + 2 + ... + nproc */
    for (i = 0; i < bufsize/nproc; ++i) {
        if (rbuf[i] != nproc*(nproc+1)/2) {
            printf("failed in %d\n", myid);
            break;
        }
    }
    printf("done\n");
}

ierr = MPI_Finalize();

I used 4 processes and found that if all 4 processes are on the same machine,
it gets through size = MAX_INT. However, if the 4 processes are on 4 different
machines, it hangs at size = 1073741824.

#0 0x000000337f6d3fc3 in __epoll_wait_nocancel () from /lib64/libc.so.6
#1 0x00002b1e9c45d4eb in epoll_dispatch (base=0xd08e940, arg=0xd08e800,
    tv=<value optimized out>) at epoll.c:215
#2 0x00002b1e9c45f98a in opal_event_base_loop (base=0xd08e940,
    flags=<value optimized out>) at event.c:838
#3 0x00002b1e9c485809 in opal_progress () at runtime/opal_progress.c:189
#4 0x00002b1e9c3ccf05 in opal_condition_wait (req_ptr=0x7fffc4519fb0,
    status=0x0) at ../opal/threads/condition.h:99
#5 ompi_request_wait_completion (req_ptr=0x7fffc4519fb0, status=0x0)
    at ../ompi/request/request.h:377
#6 ompi_request_default_wait (req_ptr=0x7fffc4519fb0, status=0x0)
    at request/req_wait.c:38
#7 0x00002b1ea0d60dda in ompi_coll_tuned_reduce_scatter_intra_ring (
    sbuf=0x7fffc4519fb0, rbuf=0x2b1ea1384010, rcounts=0xd458e30,
    dtype=0x601fa0, op=0x601790, comm=0x601390, module=0xd458a10)
    at coll_tuned_reduce_scatter.c:584
#8 0x00002b1ea0b4cd8c in mca_coll_sync_reduce_scatter (sbuf=0x2b26a1385010,
    rbuf=0x2b1ea1384010, rcounts=<value optimized out>,
    dtype=<value optimized out>, op=<value optimized out>, comm=0x601390,
    module=0xd458820) at coll_sync_reduce_scatter.c:46
#9 0x00002b1e9c3e7e51 in PMPI_Reduce_scatter (sendbuf=0x2b26a1385010,
    recvbuf=0x2b1ea1384010, recvcounts=0xd458e30,
    datatype=<value optimized out>, op=0x601790, comm=0x601390)
    at preduce_scatter.c:129
#10 0x0000000000400ddb in main (argc=1, argv=0x7fffc451a998)
    at test_reduce_scatter.c:50

Does Open MPI 1.6 use different mechanisms in reduce_scatter when communicating
within a machine versus between machines?

What is the limit on the buffer size that can be used with reduce_scatter?

Thanks for your attention.

Regards,

William