Open MPI User's Mailing List Archives

Subject: [OMPI users] Bad behavior in Allgatherv when a count is 0
From: Moreland, Kenneth (kmorel_at_[hidden])
Date: 2007-12-13 18:58:34


I have found that on rare occasions MPI_Allgatherv fails to deliver the
data to all processes. For certain combinations of receive counts and
displacements, one or more processes are missing some or all of some
arrays in their receive buffer. A necessary, but not sufficient,
condition seems to be that one of the receive counts is 0. Beyond that
I have not found any real pattern, but the example program listed
below demonstrates the failure when run on 3 processes. I have tried it
on Open MPI versions 1.2.3 and 1.2.4; it fails on both. However, it
works fine with version 1.1.2, so the problem was presumably introduced
after that release.

-Ken

Kenneth Moreland
Sandia National Laboratories
email: kmorel_at_[hidden]
phone: (505) 844-8919
fax:   (505) 845-0833

#include <mpi.h>

#include <stdlib.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  int rank;
  int size;
  int senddata[5], recvdata[100];
  int lengths[3], offsets[3];
  int i, j;

  MPI_Init(&argc, &argv);

  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  if (size != 3)
    {
    printf("Need 3 processes.\n");
    MPI_Abort(MPI_COMM_WORLD, 1);
    }

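  /* Fill the receive buffer with a sentinel and the send buffer with
     values that identify the sending rank. */
  for (i = 0; i < 100; i++) recvdata[i] = -1;
  for (i = 0; i < 5; i++) senddata[i] = rank*10 + i;
  /* Rank 1 contributes nothing; a zero count appears to be necessary
     to trigger the failure. */
  lengths[0] = 5; lengths[1] = 0; lengths[2] = 5;
  offsets[0] = 3; offsets[1] = 9; offsets[2] = 10;
  MPI_Allgatherv(senddata, lengths[rank], MPI_INT,
                 recvdata, lengths, offsets, MPI_INT, MPI_COMM_WORLD);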

  for (i = 0; i < size; i++)
    {
    for (j = 0; j < lengths[i]; j++)
      {
      if (recvdata[offsets[i]+j] != 10*i+j)
        {
        printf("%d: Got bad data from rank %d, index %d: %d\n", rank, i, j,
               recvdata[offsets[i]+j]);
        break;
        }
      }
    }

  MPI_Finalize();

  return 0;
}
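
The check in the program above only inspects the slots that each rank is
supposed to fill. A stricter check, sketched below, builds the full
expected receive buffer and compares every slot, so it would also catch
stray writes into the gaps between the arrays. It assumes the same
sentinel value of -1 and the same rank*10 + j data pattern as the program.
The names build_expected and count_mismatches are hypothetical helpers and
not part of MPI.

```c
#include <assert.h>
#include <stdio.h>

#define SENTINEL (-1)

/* Fill `expected` (bufLen ints) with what every rank's receive buffer
   should hold after the Allgatherv call in the test program: rank i
   contributes lengths[i] values 10*i + j starting at offsets[i], and
   every other slot keeps the sentinel it was initialized with. */
void build_expected(int *expected, int bufLen,
                    const int *lengths, const int *offsets, int numRanks)
{
  int i, j;

  for (i = 0; i < bufLen; i++) expected[i] = SENTINEL;
  for (i = 0; i < numRanks; i++)
    {
    /* Each rank's region must lie inside the buffer. */
    assert(offsets[i] >= 0 && offsets[i] + lengths[i] <= bufLen);
    for (j = 0; j < lengths[i]; j++) expected[offsets[i]+j] = 10*i + j;
    }
}

/* Compare `actual` against `expected` slot by slot, print each
   difference, and return the number of differing slots. */
int count_mismatches(int rank, const int *actual, const int *expected,
                     int bufLen)
{
  int i;
  int mismatches = 0;

  for (i = 0; i < bufLen; i++)
    {
    if (actual[i] != expected[i])
      {
      printf("%d: slot %d holds %d, expected %d\n", rank, i,
             actual[i], expected[i]);
      mismatches++;
      }
    }
  return mismatches;
}
```

In the test program, one could call build_expected(expected, 100, lengths,
offsets, size) after MPI_Allgatherv and then count_mismatches(rank,
recvdata, expected, 100); a nonzero result on any rank indicates the
failure.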