Open MPI User's Mailing List Archives

Subject: [OMPI users] MPI_Allgatherv error for relative large data on distributed machine, same machine is ok. Narrow down the prob.
From: ryan He (ryan.qing.he_at_[hidden])
Date: 2013-06-24 19:22:15


Dear All,

I have run into a strange problem with MPI_Allgatherv when the send
buffer becomes fairly large, though not enormous.

The simple MPI_Allgatherv test code below runs fine when all the
processors I use are on the same machine.

However, when I use processors on different machines, I see the
following results for different send buffer sizes (Bufsize is the number
of MPI_INT elements each process sends):

1. Bufsize = 2^28, run on 2 processors. OK

2. Bufsize = 2^28, run on 4 processors. Error:

   [btl_tcp_frag.c:209:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv
   error (0xffffffff85f526f8, 2147483592) Bad address(1)

3. Bufsize = 2^28-1, run on 4, 5, or 6 processors. OK

4. Bufsize = 2^29, run on 2 processors. Either the error shown below, or
   a hang with no error shown:

   [btl_tcp_frag.c:209:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv
   error (0xffffffff964605d0, 2147483632) Bad address(1)

5. Bufsize = 2^29-1, run on 2 processors. OK

6. Bufsize = 2^29-1, run on 4 processors. Hangs with no error shown

I suspect that the error occurs when the receive buffer reaches 2^30
elements, at which point mca_btl_tcp_frag_recv is handed a length close
to the int limit, as in the readv errors above. However, the test cases
above do not fully support this suspicion, and I have no idea why such a
limit would exist. The problem may also lie in the network
configuration, since it does not occur on a single machine.

Please take a look at the following test code and help me solve the
problem.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <time.h>
#include "mpi.h"

int main(int argc, char ** argv)
{
    int myid, nproc;
    long i;
    long size;
    long bufsize;
    int *rbuf;
    int *sbuf;
    char hostname[MPI_MAX_PROCESSOR_NAME];
    int len;

    /* Number of int elements allocated for each buffer: 2^31 - 1 */
    size = (long) 2*1024*1024*1024-1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);
    MPI_Get_processor_name(hostname, &len);
    printf("I am process %d with pid: %d at %s\n", myid, getpid(), hostname);
    sleep(2);

    if (myid == 0)
        printf("size : %ld\n", size);

    /* sizeof(int), not sizeof(MPI_INT): MPI_INT is a datatype handle,
       so its size is not necessarily the size of an int. */
    sbuf = (int *) calloc(size, sizeof(int));
    if (sbuf == NULL) {
        printf("fail to allocate memory of sbuf\n");
        exit(1);
    }
    rbuf = (int *) calloc(size, sizeof(int));
    if (rbuf == NULL) {
        printf("fail to allocate memory of rbuf\n");
        exit(1);
    }

    int *recvCount = calloc(nproc, sizeof(int));
    int *displ = calloc(nproc, sizeof(int));

    bufsize = 268435456; /* which is 2^28 */

    /* Every rank sends bufsize elements; blocks are laid out back to back. */
    for (i = 0; i < nproc; ++i) {
        recvCount[i] = bufsize;
        displ[i] = bufsize*i;
    }

    for (i = 0; i < bufsize; ++i)
        sbuf[i] = myid+i;

    printf("buffer size: %ld recvCount[0]:%d last displ index:%d\n",
           bufsize, recvCount[0], displ[nproc-1]);
    fflush(stdout);

    MPI_Allgatherv(sbuf, recvCount[0], MPI_INT, rbuf, recvCount, displ, MPI_INT,
                   MPI_COMM_WORLD);

    printf("OK\n");
    fflush(stdout);

    MPI_Finalize();
    return 0;
}