Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Bad behavior in Allgatherv when a count is 0
From: Tim Mattox (timattox_at_[hidden])
Date: 2008-02-07 13:30:36


Kenneth,
Have you tried the 1.2.5 version? There were some fixes to the
vector collectives that could have resolved your problem.

On Feb 4, 2008 5:36 PM, George Bosilca <bosilca_at_[hidden]> wrote:
> Kenneth,
>
> I cannot replicate this weird behavior with the current version in the
> trunk. I guess it has been fixed since 1.2.4.
>
> Thanks,
> george.
>
>
> On Dec 13, 2007, at 6:58 PM, Moreland, Kenneth wrote:
>
> > I have found that on rare occasion Allgatherv fails to pass the data to
> > all processes. Given some magical combination of receive counts and
> > displacements, one or more processes are missing some or all of some
> > arrays in their receive buffer. A necessary, but not sufficient,
> > condition seems to be that one of the receive counts is 0. Beyond that
> > I have not figured out any real pattern, but the example program listed
> > below demonstrates the failure. I have tried it on OpenMPI version
> > 1.2.3 and 1.2.4; it fails on both. However, it works fine with version
> > 1.1.2, so the problem must have been introduced since then.
> >
> > -Ken
> >
> > Kenneth Moreland
> > Sandia National Laboratories
> > email: kmorel_at_[hidden]
> > phone: (505) 844-8919
> > fax:   (505) 845-0833
> >
> >
> >
> > #include <mpi.h>
> >
> > #include <stdlib.h>
> > #include <stdio.h>
> >
> > int main(int argc, char **argv)
> > {
> >   int rank;
> >   int size;
> >   MPI_Comm smallComm;
> >   int senddata[5], recvdata[100];
> >   int lengths[3], offsets[3];
> >   int i, j;
> >
> >   MPI_Init(&argc, &argv);
> >
> >   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >   MPI_Comm_size(MPI_COMM_WORLD, &size);
> >   if (size != 3)
> >     {
> >     printf("Need 3 processes.");
> >     MPI_Abort(MPI_COMM_WORLD, 1);
> >     }
> >
> >   for (i = 0; i < 100; i++) recvdata[i] = -1;
> >   for (i = 0; i < 5; i++) senddata[i] = rank*10 + i;
> >   lengths[0] = 5; lengths[1] = 0; lengths[2] = 5;
> >   offsets[0] = 3; offsets[1] = 9; offsets[2] = 10;
> >   MPI_Allgatherv(senddata, lengths[rank], MPI_INT,
> >                  recvdata, lengths, offsets, MPI_INT, MPI_COMM_WORLD);
> >
> >   for (i = 0; i < size; i++)
> >     {
> >     for (j = 0; j < lengths[i]; j++)
> >       {
> >       if (recvdata[offsets[i]+j] != 10*i+j)
> >         {
> >         printf("%d: Got bad data from rank %d, index %d: %d\n",
> >                rank, i, j, recvdata[offsets[i]+j]);
> >         break;
> >         }
> >       }
> >     }
> >
> >   MPI_Finalize();
> >
> >   return 0;
> > }
> >
> >
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
 tmattox_at_[hidden] || timattox_at_[hidden]
    I'm a bright... http://www.the-brights.net/